Probabilistic observables, conditional correlations, and quantum physics 
We discuss the classical statistics of isolated subsystems. Only a small part of the information
contained in the classical probability distribution for the subsystem and its environment is available 
for the description of the isolated subsystem. The "coarse graining of the information" to micro- 
states implies probabilistic observables. For two-level probabilistic observables only a probability for 
finding the values one or minus one can be given for any micro-state, while such observables could 
be realized as classical observables with sharp values on a substate level. For a continuous family 
of micro-states parameterized by a sphere all the quantum mechanical laws for a two-state system 
follow under the assumption that the purity of the ensemble is conserved by the time evolution. The 
correlation functions of quantum mechanics correspond to the use of conditional correlation functions 
in classical statistics. We further discuss the classical statistical realization of entanglement within 
a system corresponding to four-state quantum mechanics. We conclude that quantum mechanics 
can be derived from a classical statistical setting with infinitely many micro-states. 



I. INTRODUCTION 

Quantum statistics is often believed to be fundamentally 
different from classical statistics. In quantum statistics, the 
complex probability amplitudes and transition amplitudes 
play a key role. Probabilities are obtained only as squares of
the amplitudes, and this gives rise to spectacular phenomena
such as interference and entanglement. In contrast, classical
statistics is directly formulated in terms of positive prob- 
abilities. Furthermore, unitarity is the most characteristic 
feature of the time evolution in quantum mechanics. This 
aspect is not easily visible in the time evolution of classical 
probabilities. Finally, the quantum mechanical uncertainty 
principle is based on the non-commutativity of the opera- 
tor product, while the pointwise product of observables in 
classical statistics is obviously commutative. 

We argue that the difference between quantum statistics 
and classical statistics is only apparent. We demonstrate 
that quantum mechanics can be described as a classical 
statistical system with infinitely many states. In this pa- 
per we mainly concentrate on a system that is equivalent to 
a two-state quantum system. We consider discrete observ- 
ables that can only take values ±1. They correspond to the 
spin operators of the equivalent quantum system. We will 
obtain the characteristic features of non-commuting spin 
operators in a classical setting. All the usual uncertainty 
relations of quantum mechanics are directly implemented. 
We also generalize our setting to a classical ensemble which 
is equivalent to four-state quantum mechanics. This allows 
us to describe the classical statistical realization of entan- 
glement and interference. 

We formulate the condition for the time evolution of our 
simplest classical ensemble that leads to the unitary trans- 
formations characteristic for the quantum evolution. It in- 
volves the concept of purity of a statistical ensemble. A 
purity conserving time evolution in classical statistics is
equivalent to the unitary time evolution in quantum mechanics.
Pure classical states are those where one of the discrete
observables takes a sharp value, say +1. This means
that the probability vanishes for all states where the
observable takes a value different from one. Pure
classical states correspond to pure quantum states and can
be described by a wave function. We derive the von Neumann
and Schrödinger equations for these states.

We take here the attitude that the basic description of 
reality should be probabilistic, while an (almost) determin- 
istic behavior arises only in limiting cases. In particular, we 
do not attempt a deterministic local hidden variable the- 
ory. However, even on the level of a probabilistic theory 
it is widely believed that quantum mechanics needs sta- 
tistical concepts beyond classical statistics, while we argue 
here that the classical statistical concepts are sufficient and 
quantum mechanics can emerge from a classical statistical 
probability distribution. 

In particular a classical statistical setting admits the def- 
inition of different conditional correlation functions for the 
description of the outcome of sequences of measurements. 
This takes into account that measurements can change the 
state of the system, and that this change may depend on 
the type of measurement. For a two-state system only one 
particular conditional correlation function allows predic- 
tions which use only the information available within the 
two-state system, without invoking information from the 
environment. This conditional correlation differs from the 
classical correlation which is based on joint probabilities. 

The reader may have strong doubts about these statements
from the beginning. The big conceptual puzzles of
quantum mechanics, such as the Einstein-Podolsky-Rosen
paradox [1], have triggered many attempts to replace quantum
mechanics by a more fundamental deterministic theory.
Based on Bell's inequalities for correlators of entangled
states it was argued that such attempts cannot
succeed, since quantum correlations contradict either realism
or locality. We will argue in the last section that both
locality and a version of "probabilistic realism", where the
elements of reality can be described by correlation functions
as well as values of observables, can be maintained.
However, the classical statistical systems which describe
quantum systems miss another property that is usually
implicitly assumed in the derivation of Bell's inequality,
namely the property of "completeness" of the statistical
system. Here a statistical system is called complete if joint
probabilities for the values of all pairs of observables are
defined and if the "measurement correlation" for pairs of
observables is determined by those joint probabilities. We
argue that the subsystems which show quantum mechanical
properties are described by "incomplete statistics" [3]
in this sense. This is closely related to a "coarse graining
of the information" if one concentrates on properties only
involving the subsystem.

It can be shown that Bell's inequalities
apply if two measurements are appropriately described by
the classical correlation function $\langle A \cdot B \rangle$ for two
observables $A$ and $B$. The "classical" or "pointwise" correlation
function $\langle A \cdot B \rangle$ means that the observables $A$ and $B$
both have fixed values in a state of the ensemble, which are
multiplied and averaged over the ensemble. Such systems are
statistically complete. Deterministic local "hidden variable
theories" usually assume this property. While Bell's
inequalities indeed exclude such deterministic local hidden
variable theories, they are not necessarily in contradiction
with a classical statistical formulation of quantum mechanics
which employs conditional correlations.
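
As a minimal illustration of why the pointwise correlation is constrained, the following Python sketch enumerates all assignments of fixed values ±1 to four two-level observables and evaluates one standard combination of pointwise products. The particular combination (the CHSH form) and all names in the code are assumptions made for illustration; the text above does not specify which form of Bell's inequalities is meant.

```python
from itertools import product

def chsh_pointwise(a1, a2, b1, b2):
    """Pointwise products combined in the CHSH form for one state
    in which all four observables have fixed values +1 or -1."""
    return a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2

# All 16 states with fixed values for the four observables.
states = list(product((+1, -1), repeat=4))
per_state = [chsh_pointwise(*s) for s in states]

def ensemble_average(probabilities):
    """Average of the pointwise combination over the ensemble."""
    return sum(p * v for p, v in zip(probabilities, per_state))

# Largest magnitude that any single state can contribute.
max_abs = max(abs(v) for v in per_state)

# Example ensemble (assumed): equal probabilities for all states.
uniform = [1 / len(states)] * len(states)
print(max_abs, ensemble_average(uniform))
```

Every state contributes either +2 or -2, so any average with positive probabilities stays between -2 and 2. This bound relies on all four observables having fixed values in every state, which is the completeness property discussed above.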



II. OUTLINE 

Our classical statistical implementation of two-state and 
four-state quantum mechanics is based on four main ingre- 
dients: (i) The quantum system is described by an isolated 
subsystem of a classical statistical ensemble with infinitely 
many degrees of freedom. It can be characterized by a
restricted set of probabilistic observables. (ii) The state of
the subsystem is determined by the expectation values of
a set of "basis observables". (iii) Conditional correlations,
which can be computed from the state of the subsystem,
describe the outcome of sequences of measurements. (iv) The
unitary time evolution of the subsystem is implemented by 
particular properties of the time evolution of the probabil- 
ity distribution of the classical ensemble. 

The concept that is perhaps least familiar is the use of 
conditional correlations in a classical statistical ensemble. 
Indeed, within classical statistics one can define correlation 
functions different from the classical or pointwise correla- 
tion. We advocate that the outcome of two measurements 
of the observables A and B should not be described by 
the pointwise correlation $\langle A \cdot B \rangle$, but rather by conditional
correlations $\langle A \circ B \rangle$, which are related to a different
product structure $A \circ B$. This can best be understood for two
subsequent measurements. In general, the first measure- 
ment "changes the state of the system" by eliminating all 
possible sequences of events which contradict this measure- 
ment. The second measurement is performed with new 
conditions, depending on the outcome of the first measure- 
ment. The idealized situation where the effect of the first 
measurement on the state of the system can be neglected
may be realized for some large systems, as in classical
thermodynamics, but not for systems with only a few effective
degrees of freedom, as is often characteristic of those
described by quantum theory. If the outcome of the first
measurement matters for the second, conditional proba- 
bilities should be used. Conditional correlations require a 
specification of the state of the system after the first mea- 
surement. The underlying elimination of possibilities con- 
tradicting the first measurement is not unique, however. 
We argue that for proper measurements of properties of 
the (sub-) system only information available for the sys- 
tem should be employed for the specification of the state 
after the first measurement. The details of the state of the 
environment should not matter. This excludes the use of 
the classical correlation function, except for certain special 
or limiting cases. 

We define within classical statistics a "conditional
product" $A \circ B$ of two observables, and the associated
"conditional correlations". They only involve information
available for the subsystem. We show how to express the
conditional correlation functions in terms of quantum me- 
chanical operator products. The conditional correlations 
for classical probabilistic observables equal the appropriate 
conditional correlations defined in quantum mechanics. If 
the correct conditional correlations are used for a descrip- 
tion of two consecutive measurements, we obtain the same 
results for the classical statistics and the quantum description.
No conflict with Bell's inequalities arises for the classical
statistics implementation of quantum mechanics. We
propose that the conditional correlation or "quantum cor- 
relation" should be used for the general description of two 
measurements in classical statistics, and not only if two 
measurements are clearly separated in time. The perhaps 
more familiar classical correlation arises from the quantum 
correlation only for appropriate limiting cases. 

We do not consider the present work as only a formal 
or mathematical reformulation of quantum mechanics. We 
have a rather physical picture in mind, where the classical 
statistical setting describes an atom simultaneously with 
its environment. This is analogous to the role of an atom 
in quantum field theory, where it appears as a particular 
excitation of a highly complicated vacuum - its "environment".
Quantum mechanical properties can arise when the
statistical description focuses on the atom and discards all 
information pertaining to the environment. Typically, the 
state of the atom is described by only a few quantities. 
In the simplest case of an atom with spin one half in the 
ground state, the (sub-) system will be described by only 
three real numbers $\rho_k$ (neglecting the motion of the atom
in space and excited energy levels). One is therefore interested
in possible observables - and structures among them
- that can be described in terms of the reduced information
contained in $\rho_k$, rather than involving the full information
contained in the classical probability distribution which de- 
scribes both the system and its environment. 

The embedding of the quantum mechanical concepts 
within a more general classical statistical setting, which 
also includes the environment, permits us to ask new ques- 
tions. What are the particular conditions for a quantum 
mechanical description to hold? Why do we observe all 
small enough systems in nature as quantum systems? We 
advocate that the unitary time evolution in quantum mechanics
expresses the "isolation" of the subsystem from its
environment. (This does not mean that the environment 
can simply be omitted from the classical description of the 
subsystem.) Interactions with the environment can lead 
to the phenomena of decoherence or "syncoherence" - the 
approach of a mixed state to a pure state. 

In this paper we present a rather detailed account of 
the classical statistics description of a quantum mechani- 
cal two-state system. We investigate discrete observables 
which can only take the values ±1. A useful notion is the
concept of "probabilistic observables", which are characterized
by a probability to find the value +1 or -1 in every
micro-state, rather than by a fixed value in a given
micro-state. Such a generalized notion of "fuzzy observables" [1]
is well known in measurement theory.

Probabilistic observables can be implemented as classi- 
cal observables if the micro-state consists of several or even 
infinitely many substates. In other words, once part of the 
degrees of freedom of a classical statistical system - the 
substates - are "integrated out" , a classical observable on 
the substate level becomes a probabilistic observable on the 
level of the remaining micro-states. The quantum mechani- 
cal pure and mixed states will be associated with particular 
micro-states. A typical observable may have a sharp value 
for particular micro-states, but typically a probability dis- 
tribution of different values in the other micro-states. 

The general notion of a probabilistic observable is a 
rather wide concept. One has to specify the amount of 
information available, in particular concerning joint proba- 
bilities for pairs of two probabilistic observables. In our ap- 
proach this is adapted to the picture of a subsystem within 
a classical statistical "environment". On the level of the 
micro-states two observables A and B are, in general, not 
comeasurable. This means that the joint probability $p_{ab}$ of
finding the value $a$ for $A$ and $b$ for $B$ is not defined.^1

^1 In this context it may be interesting to compare our setting for
probability distributions and observables to the so-called "classical
extension" of quantum statistical systems. This mathematical
approach shows certain similarities, but also important differences
to our setting. For a "classical extension" one constructs a
probability distribution for classical states which correspond to the
pure quantum states, and one extends the notion of observables
such that two observables which correspond to non-commuting operators
in quantum mechanics become comeasurable. (As an example,
the extension provides the joint probability for $S_x$ and another
spin component to both have the value up or +1. Such a
joint probability is not available in quantum mechanics.) In our
approach, comeasurability is not given on the level of micro-states,
even though we may construct on this level probability distributions
on the manifold of pure quantum states. Instead, we can
realize comeasurability on the level of the classical substates which
describe the system and its environment. However, only a submanifold
of the possible probability distributions for the substates can
be mapped to the quantum states. (It may nevertheless be possible
that one can formally construct a "classical extension" also in our
approach - so far this does not play a role for the description of the
system.)

The notion of micro-states is introduced in this paper
partly for purposes of gaining intuition. It demonstrates in
a simple way how the notion of probabilistic or fuzzy
observables, which is characteristic for observables in a quantum
state, arises naturally in a classical statistical setting.
Alternatively, our approach could proceed directly from
classical states with fixed values of classical observables
(the substates in this paper) to the quantum system. This
procedure is followed in ref. [3] and related references, where the
mathematical structures underlying our concepts are discussed
in more detail. We also give elsewhere simple explicit realizations
of the classical statistical ensembles which describe
the quantum system and its environment. In a formulation
based on the substates the notion of micro-states need not
be introduced and one can proceed directly to the quantum
states of the subsystem.

The setting is then simply classical statistics with all 
classical observables taking fixed values in all states. Nev- 
ertheless the notion of micro-states and probabilistic ob- 
servables may be considered as a useful way to organize 
the statistical information for certain specific systems. The 
mapping from the classical observables on the substate 
level to the probabilistic observables on the micro-state 
level is not invertible. We will see that the operators in 
quantum mechanics correspond to the probabilistic observables.
Due to the lack of invertibility, no map exists which
associates a classical observable to each quantum operator.
For this reason the Kochen-Specker theorem does
not apply. On the other hand, the implementation of probabilistic
observables as classical observables on the level of
substates is actually not necessary. One may, alternatively,
treat the probabilistic observables as genuine objects of a
classical statistical description of reality. In this version the
Kochen-Specker theorem finds no application because the
classical observables do not have fixed values. A more detailed
discussion of the properties of observables and states
can be found in [13].

Beyond a statistical setting for states and observables 
other key features of quantum mechanics have to be im- 
plemented in a classical statistical setting. This concerns, 
first of all, a prescription for predictions of the outcome of 
two (or more) measurements - an issue related to the con- 
cept of correlation and discussed extensively in this paper. 
Many physicists believe that a proper probabilistic setting 
for quantum mechanical observables is not the central dis- 
tinction between classical statistics and quantum mechan- 
ics, but rather the issue of correlations. We share this 
opinion. Furthermore, one has to understand the quantum 
mechanical time evolution, starting from the time evolu- 
tion of a classical probability distribution. This requires 
an explicit construction of the density matrix in terms of 
the classical probability distribution. At the end, the issue 
will be to understand how quantum mechanical behavior 
emerges for physical systems within a more general classi- 
cal statistical setting. 

The classical statistical description of quantum mechanics
can be generalized to systems with more than two quantum
states. In a four-state system we have described the
phenomenon of entanglement between two two-state subsystems.
Entanglement is often believed to be the centerpiece
of quantum statistics. It is a central issue of many
theoretical discussions about the foundations of quantum
mechanics, such as decoherence [17] or the measurement process
[3], and it underlies the idea of quantum computing [19].
Spectacular experiments on teleportation [10] rely on it.
A classical statistics description of entanglement may find
useful practical applications and influence the conceptual
and philosophical discussion based on this phenomenon.
We give a short account of four-state quantum mechanics
in the later part of this paper. There is no limitation in the
number of quantum states $M$. Taking $M \to \infty$ yields
continuous quantum mechanics. In particular, the quantum
particle in a potential has been described by a classical
statistical ensemble in this way.

This paper is organized as follows. In sect. III we discuss
the notion of probabilistic observables, with particular
emphasis on two-level observables that may also be called
spins. Sect. IV compares the realization of rotations in
the classical and quantum statistical setting. The classical
system needs an infinity of micro-states if a continuous
rotation is to be realized. At least one "classical pure state"
must exist for every rotation angle. In sect. V we reduce
the infinity of classical micro-states to a finite number of
"effective states". All expectation values of the spin observables
can be computed in the effective state description
in terms of three numbers $\rho_k$. The price of the reduction
is, however, that the "effective probabilities" $\rho_k$ are not
necessarily positive anymore. In sect. VI we construct the
density matrix $\rho$ of quantum mechanics from the effective
probabilities. We establish for our classical statistical ensemble
the quantum mechanical rule for the computation
of expectation values of observables, $\langle A \rangle = \mathrm{tr}(A\rho)$.

Sect. VII turns to the issue of correlation functions and
introduces the conditional product of two observables and
the conditional correlation. The conditional two point
function is commutative. This does not hold for the higher
conditional correlation functions - the three point function
is not commutative since the order of consecutive measurements
matters. Sect. VIII completes the mapping between
classical statistics and quantum statistics. We introduce
the wave function for pure states and relate conditional
probabilities to squares of quantum mechanical transition
amplitudes. We then derive the expression of the conditional
correlations in terms of quantum mechanical operator
products. The non-commutativity of the conditional
three point function can be directly traced to non-vanishing
commutators of operators. The derivation of these results
demonstrates the use of quantum mechanical transition
amplitudes for questions arising in a classical statistical
setting. A simple example for a classical ensemble with a
finite number of degrees of freedom is presented in sect.
IX. It describes three Cartesian spins without a continuous
rotation symmetry.

In sect. X we deal with the time evolution. We define
the purity $P$ of a statistical ensemble - in our simplest case

$$P = \sum_k \rho_k^2 = 2\,\mathrm{tr}\rho^2 - 1. \tag{1}$$
A purity conserving time evolution amounts to the unitary
time transformation of quantum mechanics. Furthermore,
a more general time evolution of the classical ensemble
can describe decoherence for decreasing purity, as well
as "syncoherence" for increasing purity. We argue that
the quantum mechanical pure states correspond to partial
fixed points of the more general time evolution in classical
statistics. In sect. XI we briefly discuss classical systems
with a finite number of states $N$, which may be used to
obtain quantum mechanics in the limit $N \to \infty$. Sect.
XII discusses the possible realizations of probabilistic observables.
We generalize in sect. XIII our construction to
classical ensembles that correspond to four-state quantum
mechanics. We show the classical realization of entanglement,
interference and the distinction between bosons and
fermions. Finally, we summarize in sect. XIV the conceptual
issues of realism, locality and completeness for statistical
systems and discuss Bell's inequalities. Our conclusions
are presented in sect. XV.

III. PROBABILISTIC OBSERVABLES 

In this section we discuss the basic notion of probabilistic 
observables which do not have a sharp value in a given 
micro-state. 

1. Expectation values 

Consider a probabilistic system with $N$ classical micro-states,
labeled by $\sigma = 1 \dots N$, and characterized by probabilities
$p_\sigma \geq 0$, $\sum_\sigma p_\sigma = 1$. A classical or deterministic
observable $A$ is specified by $N$ real numbers $A_\sigma$, such
that the expectation value reads

$$\langle A \rangle = \sum_{\sigma=1}^{N} p_\sigma A_\sigma. \tag{2}$$

In a given micro-state $\sigma$ the classical observable has a fixed
value, namely $A_\sigma$. The probabilistic nature of the system
arises only from the probabilities to find a given micro-state
$\sigma$.

This concept can be generalized by introducing probabilistic
observables, for which we can only give probabilities
to find a certain value in a given micro-state $\sigma$. Probabilistic
or fuzzy observables are well known in measurement
theory and have been investigated for quantum and classical
systems. We give here a simple description
of the properties relevant for our discussion. A probabilistic
observable is characterized by a set of real functions
$w_\sigma(x) \geq 0$, normalized according to $\int dx\, w_\sigma(x) = 1$. The
expectation values of powers of the probabilistic observables
in a given micro-state $\sigma$ obey

$$A^Q_\sigma = \int dx\, x^Q w_\sigma(x), \qquad A_\sigma = \int dx\, x\, w_\sigma(x). \tag{3}$$

Correspondingly, the expectation values in a macro-state
of the probabilistic system read

$$\langle A^Q \rangle = \sum_\sigma A^Q_\sigma p_\sigma, \qquad \langle A \rangle = \sum_\sigma A_\sigma p_\sigma. \tag{4}$$

Classical observables correspond to the special case

$$w_\sigma(x) = \delta(x - A_\sigma), \qquad A^Q_\sigma = (A_\sigma)^Q. \tag{5}$$
In this case all moments $A^Q_\sigma$ are fixed in terms of the
mean value in the micro-state $\sigma$, i.e. the moment for
$Q = 1$, $A_\sigma = \int dx\, x\, w_\sigma(x)$. In contrast, for the most general
probabilistic observables the infinite set of moments
$A^Q_\sigma$ may be used in order to parameterize the distribution
$w_\sigma(x)$. For the most general probabilistic observable
much more information is therefore needed for its precise
specification, namely infinitely many real numbers $A^Q_\sigma$ instead
of the $N$ real numbers $A_\sigma$ for a classical observable.
The probabilistic nature of the system is now twofold. It
arises from the probability distribution $w_\sigma(x)$ to find the
value $x$ of the observable in the micro-state $\sigma$, and from the
probability distribution for the micro-states, $\{p_\sigma\}$, characterizing
a given macro-state or ensemble. The relation (2)
for the expectation value of $A$ remains valid for probabilistic
observables. However, the expectation values of higher
powers $A^Q$, $Q \geq 2$, as given by eq. (4), may differ from
classical observables (cf. eq. (5)).
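
The difference between eqs. (4) and (5) for higher powers can be made concrete with a small Python sketch. It replaces the integral over $x$ by a sum over a finite set of values, which is an assumption made for simplicity; the example numbers and function names are likewise illustrative only.

```python
def state_moment(dist, q):
    """Q-th moment in one micro-state, cf. eq. (3).
    dist maps a value x to its probability w_sigma(x) (discrete stand-in
    for the continuous distribution)."""
    return sum(w * x**q for x, w in dist.items())

def ensemble_moment(p, dists, q):
    """Eq. (4): <A^Q> = sum_sigma A^Q_sigma * p_sigma."""
    return sum(p_s * state_moment(d, q) for p_s, d in zip(p, dists))

p = [0.5, 0.5]  # probabilities of two micro-states

# Classical observable: sharp values 0.5 and -0.5, cf. eq. (5).
classical = [{0.5: 1.0}, {-0.5: 1.0}]

# Probabilistic observable with the same mean values 0.5 and -0.5
# in the two micro-states, but values +1 or -1 in each measurement.
probabilistic = [{1: 0.75, -1: 0.25}, {1: 0.25, -1: 0.75}]

mean_classical = ensemble_moment(p, classical, 1)
mean_probabilistic = ensemble_moment(p, probabilistic, 1)
second_classical = ensemble_moment(p, classical, 2)
second_probabilistic = ensemble_moment(p, probabilistic, 2)
print(mean_classical, mean_probabilistic, second_classical, second_probabilistic)
```

Both observables share the same mean values in every micro-state and therefore the same $\langle A \rangle$, while their second moments differ.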

2. Two-level observables 

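As a specific example for a probabilistic observable we
concentrate in this paper on the bi-modal distribution

$$w_\sigma(x) = \tfrac{1}{2}(1 + A_\sigma)\,\delta(x - 1) + \tfrac{1}{2}(1 - A_\sigma)\,\delta(x + 1),$$

$$-1 \leq A_\sigma \leq 1, \qquad A_\sigma = \int dx\, x\, w_\sigma(x), \qquad \int dx\, x^2 w_\sigma(x) = 1. \tag{6}$$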

In any micro-state $\sigma$ the observable can only take the values
+1 or -1. In other words, for a given micro-state $\sigma$ the
observable is specified by the relative probabilities to find
the values +1 or -1,

$$w_{\sigma+} = (1 + A_\sigma)/2, \qquad w_{\sigma-} = (1 - A_\sigma)/2. \tag{7}$$

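Thus $N$ real numbers $A_\sigma$ are again sufficient to specify the
"two-level observables" obeying eq. (6). The moments are
given by

$$A^Q_\sigma = \begin{cases} A_\sigma & \text{for } Q \text{ odd} \\ 1 & \text{for } Q \text{ even} \end{cases} \tag{8}$$

implying for the macro-state

$$\langle A^Q \rangle = \begin{cases} \langle A \rangle & \text{for } Q \text{ odd} \\ 1 & \text{for } Q \text{ even} \end{cases} \tag{9}$$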

We may realize the ensemble or the macro-state by an infinite
set of measurements with identical conditions. Each
measurement realizes a particular micro-state, and the $p_\sigma$
give the relative frequencies with which a given micro-state
$\sigma$ is encountered in the ensemble. Since for any given
micro-state the two-level observable can only take the values
+1 or -1, the series of measurements of $A$ will produce
a series of values +1 or -1, with relative probabilities
$w_\pm = \frac{1}{2}(1 \pm \langle A \rangle)$. This is an easy way to understand why
$\langle A^2 \rangle = 1$ for arbitrary $\{p_\sigma\}$. The situation corresponds
exactly to a quantum mechanical spin-1/2 system, with an
appropriate normalization of the spin operator, say $(\hbar/2) S_z$
in the $z$-direction: each measurement will give one
of the eigenvalues ±1 of the operator $S_z$. We will see that
the association of the probabilistic two-level observable $A$
with a quantum-mechanical spin can be pushed much further
than the possible outcome of a series of measurements.
We will therefore often denote the two-level observables by
"spins", but the reader should keep in mind that we deal
here with purely classical probabilistic objects.
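
The series of measurements described above can be simulated directly. The following Python sketch draws a micro-state with probability $p_\sigma$ and then a value ±1 with the probabilities of eq. (7). The example numbers, the finite number of measurements, the fixed random seed and all names are assumptions for illustration.

```python
import random

def measure_series(p, A, n, seed=0):
    """Simulate n measurements of a two-level observable.
    p[sigma] are the micro-state probabilities, A[sigma] the mean values."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        # Each measurement realizes one micro-state sigma.
        sigma = rng.choices(range(len(p)), weights=p)[0]
        # Within sigma, the value +1 occurs with w_sigma+ = (1 + A_sigma)/2.
        w_plus = (1 + A[sigma]) / 2
        outcomes.append(1 if rng.random() < w_plus else -1)
    return outcomes

p = [0.2, 0.8]        # example micro-state probabilities
A = [0.6, -0.1]       # example mean values A_sigma, each within [-1, 1]

mean_A = sum(ps * As for ps, As in zip(p, A))   # <A> from eq. (4)
expected_w_plus = (1 + mean_A) / 2              # w_+ = (1 + <A>)/2

outcomes = measure_series(p, A, 100_000)
freq_plus = outcomes.count(1) / len(outcomes)
mean_square = sum(o * o for o in outcomes) / len(outcomes)
print(expected_w_plus, freq_plus, mean_square)
```

The observed frequency of the value +1 approaches $w_+$ as the number of measurements grows, while the mean of the squared outcomes equals one for any choice of $\{p_\sigma\}$, since every single outcome is ±1.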

Two-level observables are the simplest non-classical
probabilistic observables. By simple shifts they can be easily
generalized to any situation where an observable can
only take two values (two "levels") in any given micro-state,
like occupied / empty. One bit is enough for the
possible values of the observable in a micro-state $\sigma$, say
0 for $x = -1$ and 1 for $x = 1$. Nevertheless, the specification
of the probabilistic observable needs the real numbers $A_\sigma$.
Instead of a continuous distribution $w_\sigma(x)$ we can replace
eq. (3) by a discrete sum

$$A^Q_\sigma = \frac{1}{2} \sum_{x=\pm 1} x^Q (1 + x A_\sigma). \tag{10}$$
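
A short Python check, with illustrative function names, confirms that the discrete sum of eq. (10) reproduces the moments of eq. (8) for odd and even powers.

```python
def moment_discrete(A_sigma, q):
    """Eq. (10): A^Q_sigma = (1/2) * sum over x = +1, -1 of x^Q (1 + x A_sigma)."""
    return 0.5 * sum(x**q * (1 + x * A_sigma) for x in (+1, -1))

def moment_closed_form(A_sigma, q):
    """Eq. (8): A_sigma for odd Q, 1 for even Q."""
    return A_sigma if q % 2 == 1 else 1.0

A_sigma = 0.3  # example mean value in one micro-state
for q in range(1, 7):
    print(q, moment_discrete(A_sigma, q), moment_closed_form(A_sigma, q))
```

The term with $x = 1$ carries the weight $w_{\sigma+}$ of eq. (7) and the term with $x = -1$ the weight $w_{\sigma-}$, so the sum is simply the average of $x^Q$ over the two possible values.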

3. Substates 

A single two-level observable can be represented as a
classical observable in an extended statistical system consisting
of substates. As an example, we may associate all
points within the circle in Fig. 1 with substates. (We may
consider a finite resolution with a finite number of points or
we can consider the limit where the number of points goes
to infinity.) A characteristic two-level observable answers
the question if the points are in the upper half plane or
in the lower half plane. On the substate level the classical
two-level observable $A^{(1)}$ takes the value 1 for all points
above the horizontal axis, and -1 for the points below the
horizontal axis. If we denote the substates by $\tau$ one has in
every state a fixed value, $A^{(1)}_\tau = \pm 1$.

FIG. 1: Micro-states and two-level observables. (Circle of substates divided into up/down and left/right halves; the shaded upper half is the micro-state $\sigma = 1$.)

We next consider two micro-states $\sigma = 1, 2$. The first
state ($\sigma = 1$) corresponds to a coarse graining where all
points in the upper half plane are grouped together (shaded
region in Fig. 1), while the second micro-state ($\sigma = 2$) combines
the points in the lower half plane. The particular two-level
observable $A^{(1)}$ remains a classical observable, with
sharp values in the micro-states, $A^{(1)}_{\sigma=1} = 1$, $A^{(1)}_{\sigma=2} = -1$.
We may also consider a second two-level observable $A^{(2)}$
for the decision between left and right. On the substate
level it is again a classical observable, with $A^{(2)}_\tau = 1$ for
all points on the right of the vertical axis, and $A^{(2)}_\tau = -1$
if the substates are represented by points on the left side
of the vertical axis. On the level of the two micro-states,
however, the observable $A^{(2)}$ no longer has sharp values.
For $\sigma = 1$ (shaded region in Fig. 1) the micro-state groups
together points from both the left and the right half. Typically,
the observable $A^{(2)}$ has a distribution of values ±1 in
this state, with $|A^{(2)}_{\sigma=1}| < 1$. Thus $A^{(2)}$ has to be described
by a probabilistic observable on the level of micro-states.

We may label the substates $\tau$ according to the micro-state
$\sigma$ to which they "belong", $\tau = (\sigma, t_\sigma)$. (Here $t_\sigma$ distinguishes
the states that belong to a given micro-state $\sigma$.
For example, $t_{\sigma=1}$ are suitable coordinates for the points
in the upper half plane in Fig. 1.) If the probability distribution
$p_\tau$ on the substate level is known, the probabilities
for the micro-states obey

$$p_\sigma = \sum_{t_\sigma} p(\sigma, t_\sigma). \tag{11}$$



The mean value of the observable A in the micro-state a is 
given by 



Aa = A{a, tcr)p{a, t„)/pa 



(12) 



where p{a,ta)/P(j specifies the relative probabilities of two 
substates t^^ for a given cr. It is easy to verify that the en- 
semble average of A can be computed both on the substate 
or micro-state levels 

{A) = ^PrAr =^^p{(J,t„)A{(J,t„) 
T cr t„ 

= J2paA^. (13) 
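
The coarse graining of eqs. (11)-(13) can be checked numerically. The following is only a minimal sketch: the substates are random points in the unit disk standing in for Fig. 1, their probabilities are arbitrary, and all function names are hypothetical helpers introduced for illustration.

```python
import random

random.seed(0)

# Illustrative substates tau: random points in the unit disk (an assumption
# standing in for Fig. 1), with arbitrary normalized probabilities p_tau.
points = []
while len(points) < 1000:
    x, y = random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)
    if x * x + y * y <= 1.0 and x != 0.0 and y != 0.0:
        points.append((x, y))
weights = [random.random() for _ in points]
norm = sum(weights)
p_tau = [w / norm for w in weights]

def A1(point):
    """Classical observable: +1 above the horizontal axis, -1 below."""
    return 1 if point[1] > 0 else -1

def A2(point):
    """Classical observable: +1 right of the vertical axis, -1 left."""
    return 1 if point[0] > 0 else -1

def micro_state(point):
    """Coarse graining: sigma = 1 (upper half plane), sigma = 2 (lower)."""
    return 1 if point[1] > 0 else 2

def coarse_grain(observable):
    """Return (p_sigma, mean_sigma) following eqs. (11) and (12)."""
    p_sigma = {1: 0.0, 2: 0.0}
    weighted = {1: 0.0, 2: 0.0}
    for point, p in zip(points, p_tau):
        sigma = micro_state(point)
        p_sigma[sigma] += p
        weighted[sigma] += observable(point) * p
    mean_sigma = {s: weighted[s] / p_sigma[s] for s in p_sigma}
    return p_sigma, mean_sigma

def average_on_substates(observable):
    """Ensemble average computed directly on the substate level."""
    return sum(observable(pt) * p for pt, p in zip(points, p_tau))

def average_on_micro_states(observable):
    """Ensemble average computed from micro-state data only, eq. (13)."""
    p_sigma, mean_sigma = coarse_grain(observable)
    return sum(p_sigma[s] * mean_sigma[s] for s in p_sigma)
```

In this sketch the first observable keeps sharp mean values $\pm 1$ in the two micro-states, while the second one acquires mean values of modulus smaller than one, i.e. it becomes probabilistic.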

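For the observable $A^{(2)}$ we may compute the probability to find $A^{(2)} = 1$ in a given micro-state $\sigma$ as

$$w^{(2)}_{\sigma+} = \sum_{s_{\sigma+}} p(\sigma, +, s_{\sigma+})/p_\sigma, \tag{14}$$

where we have further decomposed the substates as $t_\sigma = (\sigma', s_{\sigma\sigma'})$, $\sigma' = (+, -)$. (The states with $\sigma = 1$, $\sigma' = +$ are represented by the points in the upper right quarter inside the circle of Fig. 1.) Unless $p(\sigma, -, s_{\sigma-}) = 0$ for all substates $(\sigma, -, s_{\sigma-})$ one has $w^{(2)}_{\sigma+} < 1$. Similarly, one finds $w^{(2)}_{\sigma+} > 0$ unless $p(\sigma, +, s_{\sigma+}) = 0$ for all corresponding substates. The probabilistic observable $A^{(2)}$ becomes deterministic in the micro-state $\sigma$ (sharp value) only if $w^{(2)}_{\sigma+} = 1$ or $w^{(2)}_{\sigma+} = 0$. With

$$\bar A^{(2)}_\sigma = w^{(2)}_{\sigma+} - w^{(2)}_{\sigma-} = \Big[\sum_{s_{\sigma+}} p(\sigma, +, s_{\sigma+}) - \sum_{s_{\sigma-}} p(\sigma, -, s_{\sigma-})\Big]/p_\sigma \tag{15}$$

we find consistency with eq. (12).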

In the coarse graining step from substates to micro-states most of the information contained in the probability distribution $p_\tau$ for the substates is lost. Instead of (infinitely) many numbers $p_\tau$ only two numbers, $w^{(1)}_+$ and $w^{(2)}_+$, are necessary to characterize the expectation values of the observables $A^{(1)}$ and $A^{(2)}$ and powers thereof. We will see later
the correspondence between quantum states and micro- 
states - in both types of states observables have genuinely 
a distribution of values rather than fixed values. The state 
of a two-state quantum system will be fully characterized 
by the expectation values of three basis observables $A^{(k)}$.
The coarse graining from the substate level to the micro- 
states should be interpreted in an abstract sense rather 
than being associated to resolution in space. In a general 
sense, a quantum system can be regarded as an "isolated 
system" within its environment. For example, we may re- 
gard an atom as an excitation of the vacuum, similar to the 
conceptual setting of quantum field theory. The vacuum is 
a complicated system, involving infinitely many degrees of 
freedom, which may be associated to the substates $\tau$. In
contrast, an atom, say in the ground state which admits 
only two spin polarizations, involves only a few degrees of 
freedom. These degrees of freedom can be associated with 
an "isolated system" and will be represented on the level 
of micro-states with probabilistic observables. 

As long as we consider only a single two-level probabilistic observable (say $A^{(2)}$) a minimal implementation as a classical deterministic observable does not require a large number of substates $\tau$. It is sufficient to assume that each micro-state $\sigma$ consists of two substates $\sigma_+$ and $\sigma_-$, for which the observable has either the sharp value $+1$ (for $\sigma_+$) or $-1$ (for $\sigma_-$). The probabilities for the substates are then given by $p_{\sigma+} = p_\sigma(1 + \bar A_\sigma)/2$, $p_{\sigma-} = p_\sigma(1 - \bar A_\sigma)/2$.
We emphasize that a representation of probabilistic observ- 
ables as classical observables on a substate level is possible, 
but not necessary. We could also consider probabilistic ob- 
servables as a basic, more general definition of observables 
and never refer to substates. This will be discussed in sect. 

ixm 






4. Operations among probabilistic observables

Consider a probabilistic observable $A$ with a discrete and finite spectrum of possible measurement values $\gamma_a$. (The generalizations to a continuous or infinite spectrum are straightforward.) Besides the spectrum $\{\gamma_a\}$ a probabilistic observable is characterized by the associated probabilities $w_a(\sigma)$ for every micro-state $\sigma$. We can always define the multiplication with a constant $c$ by $A \to cA : \gamma_a \to c\gamma_a$. Similarly, we may "shift" the probabilistic observable by adding a piece $s$ proportional to the unit observable, $A \to A + s : \gamma_a \to \gamma_a + s$. More generally, we can define $F(A)$ by $\gamma_a \to F(\gamma_a)$. However, the sum of two probabilistic observables $A$, $B$ is, in general, not defined. For different probability distributions $w^{(A)}_a(\sigma)$, $w^{(B)}_b(\sigma)$ it makes no sense to "add the possible measurement values". For a similar reason the product of $A$ and $B$ is not defined for general probabilistic observables.
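
The operations that are always defined (scaling, shifting, and functions $F(A)$) act on the spectrum only and leave the probabilities untouched. A minimal sketch of such a data structure follows; the class name `ProbabilisticObservable` and its methods are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProbabilisticObservable:
    spectrum: tuple  # possible measurement values gamma_a
    probs: dict      # micro-state sigma -> tuple of probabilities w_a(sigma)

    def mean(self, sigma):
        """Mean value in the micro-state sigma."""
        return sum(g * w for g, w in zip(self.spectrum, self.probs[sigma]))

    def apply(self, func):
        """F(A): map gamma_a -> F(gamma_a), keeping the probabilities w_a(sigma)."""
        return ProbabilisticObservable(
            tuple(func(g) for g in self.spectrum), self.probs
        )

    def scale(self, c):
        """A -> cA."""
        return self.apply(lambda g: c * g)

    def shift(self, s):
        """A -> A + s."""
        return self.apply(lambda g: g + s)

# Example two-level observable on two micro-states (illustrative numbers).
spin = ProbabilisticObservable((1, -1), {1: (0.75, 0.25), 2: (0.5, 0.5)})
```

Note that no method for adding or multiplying two different observables is provided: without joint probabilities there is nothing to compute from.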

A special situation arises if the joint probability to find simultaneously a value $\gamma^{(A)}_a$ for $A$ and $\gamma^{(B)}_b$ for $B$ is known for every micro-state. (We occasionally use a simplified notation where $\gamma^{(A)}_a$ is replaced by $\gamma_a$.) We denote this probability by $w_{ab}(\sigma)$ and observe $w_a(\sigma) = \sum_b w_{ab}(\sigma)$, $w_b(\sigma) = \sum_a w_{ab}(\sigma)$. In this case of "comeasurability" one may use the generalized definition

$$\bar A_\sigma = \sum_{ab} \gamma_a w_{ab}(\sigma), \tag{16}$$

and similar for $(f(A))_\sigma$ or $(f(B))_\sigma$. We can now define $A + B$ by adding $\gamma_a + \gamma_b$, and $AB$ by multiplying $\gamma_a \gamma_b$,

$$(A + B)_\sigma = \sum_{ab} (\gamma_a + \gamma_b)\, w_{ab}(\sigma), \qquad (AB)_\sigma = \sum_{ab} \gamma_a \gamma_b\, w_{ab}(\sigma). \tag{17}$$

This can be extended to general functions $F(A, B)$ according to

$$(F(A, B))_\sigma = \sum_{ab} F(\gamma_a, \gamma_b)\, w_{ab}(\sigma). \tag{18}$$

We will see later that this special situation of comeasurable 
observables corresponds to commuting observables in quan- 
tum mechanics. On the other hand, the non-availability 
of a joint probability for general probabilistic observables 
will motivate us to describe sequences of measurements in 
terms of conditional correlations that do not involve the 
joint probabilities. This will turn out to be a basic in- 
gredient for the violation of Bell's inequalities in quantum 
measurements. 
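
A small numerical illustration of comeasurable observables, under assumed numbers: the joint probabilities below are invented for the example, and the helper names are hypothetical. The sketch also shows that two different joint distributions with identical marginals give different products, which is why the marginals alone do not define $AB$.

```python
def marginals(w_ab):
    """Return (w_a, w_b) from the joint probabilities w_ab of one micro-state."""
    w_a = [sum(row) for row in w_ab]
    w_b = [sum(col) for col in zip(*w_ab)]
    return w_a, w_b

def joint_mean(func, gamma_a, gamma_b, w_ab):
    """(F(A, B))_sigma = sum_ab F(gamma_a, gamma_b) w_ab(sigma), cf. eq. (18)."""
    return sum(
        func(ga, gb) * w_ab[i][j]
        for i, ga in enumerate(gamma_a)
        for j, gb in enumerate(gamma_b)
    )

gamma = (1, -1)                           # spectrum of both two-level observables
w_correlated = [[0.4, 0.1], [0.2, 0.3]]   # assumed joint probabilities
w_independent = [[0.3, 0.2], [0.3, 0.2]]  # same marginals, no correlation

sum_mean = joint_mean(lambda a, b: a + b, gamma, gamma, w_correlated)
product_correlated = joint_mean(lambda a, b: a * b, gamma, gamma, w_correlated)
product_independent = joint_mean(lambda a, b: a * b, gamma, gamma, w_independent)
```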

For the example in Fig. 1 the spectrum $\gamma_a$ of the observable $A^{(1)}$ consists only of two values $\pm 1$, and similar for $\gamma_b$ for $A^{(2)}$. The joint probability $w_{++}(\sigma)$ for finding the values $A^{(1)} = 1$, $A^{(2)} = 1$ can be defined on the substate level as

$$w_{++}(\sigma = 1) = w^{(2)}_{1+}, \qquad w_{++}(\sigma = 2) = 0, \tag{19}$$

with $w^{(2)}_{\sigma+}$ given by eq. (14). (For a minimal set of substates there is only one state $(\sigma, +)$ and therefore no sum over $s_{\sigma+}$ needed, $w^{(2)}_{\sigma+} = p(\sigma, +)/p_\sigma$.) The joint probability is available if one knows for each quarter of the circle the probability to be realized, i.e. if

$$p_{\sigma\pm} = \sum_{s_{\sigma\pm}} p(\sigma, \pm, s_{\sigma\pm}) = w^{(2)}_{\sigma\pm}\, p_\sigma \tag{20}$$

are known. With

$$p_{\sigma+} + p_{\sigma-} = p_\sigma, \qquad \sum_{\sigma=1}^{2} p_\sigma = 1 \tag{21}$$

this requires the knowledge of three independent numbers. On the level of micro-states and probabilistic observables these three numbers are, in general, not available, since the state of the system may be characterized by only two numbers, $\langle A^{(1)}\rangle$ and $\langle A^{(2)}\rangle$.

We will discuss in sect. VII that the absence of knowledge of joint probabilities is a crucial aspect for the definition of correlations. The substate-probabilities (20) involve properties of the quantum system together with its environment. They are not accessible by measurements involving only the quantum system. In sect. XIII we will see the connection between the availability of information about joint probabilities in a quantum system and commuting quantum mechanical operators. For the systems investigated in sects. IV-XII the combined probability $w_{ab}(\sigma)$ for two independent probabilistic observables will not be available. This will be different in sect. XIII where we consider observables that correspond to commuting quantum operators. For such "commuting observables" the joint probabilities are available. However, not all observables of the four-state quantum system discussed in sect. XIII are mutually commuting. In sect. XIV we argue that the missing joint probabilities are the key for the understanding of characteristic quantum mechanical features such as the violation of Bell's inequalities.



IV. SPIN ROTATIONS IN CLASSICAL 
STATISTICS 

We may start with a single two-level observable or spin and a system with only two micro-states, i.e. the states $(+)$ ($\sigma = 1$) and $(-)$ ($\sigma = 2$), with $\bar A_1 = 1$, $\bar A_2 = -1$. The expectation value reads $\langle A\rangle = p_1 - p_2$. Since for all $\sigma$ one has $|\bar A_\sigma| = 1$, this special case corresponds actually to a classical observable. The distribution $w_\sigma(x)$ involves only one $\delta$-function, $w_1(x) = \delta(x - 1)$, $w_2(x) = \delta(x + 1)$. If we assign instead the values $\bar A_1 = \bar A_2 = 0$ we encounter a genuinely probabilistic variable, leading in this case to a random distribution of $+1$ and $-1$ measurements. Our example in Fig. 1 corresponds to this simplest case with two micro-states. The first (classical) two-level observable corresponds to $A^{(1)}$, the second (probabilistic) observable to $A^{(2)}$.

An interesting case with two spin observables involves four classical micro-states, that we denote by $(+1)$ or $(0)$ ($\sigma = 1$), $(-1)$ or $(\pi)$ ($\sigma = 2$), $(+2)$ or $(\frac{\pi}{2})$ ($\sigma = 3$) and $(-2)$ or $(-\frac{\pi}{2})$ ($\sigma = 4$), according to the full dots in Fig. 2. The corresponding values of $\bar A^{(1)}_\sigma$ and $\bar A^{(2)}_\sigma$ are shown in the left half of Table I. The expectation values of the two spins obey

$$\langle A^{(1)}\rangle = p_1 - p_2, \qquad \langle A^{(2)}\rangle = p_3 - p_4. \tag{22}$$

FIG. 2: Location of micro-states and expectation values of spins

TABLE I: Mean values of spin observables in different micro-states

We note that both $A^{(1)}$ and $A^{(2)}$ are truly probabilistic observables since in some micro-states they have a zero mean value and thus equal probability for $+1$ and $-1$ values. We could also include observables with opposite mean values. Since they are obviously just a multiplication of $A^{(k)}$ by $-1$, we will not discuss them separately.

On this level the probabilistic observables could again be expressed in terms of classical observables for a system with a higher number of states. We may assume that each micro-state consists of four substates with fixed values for $A^{(1)}$ and $A^{(2)}$, i.e. $++$, $+-$, $-+$ and $--$, such that we have a total of sixteen classical substates. Their probabilities can be denoted by $p_{\sigma++}$, $p_{\sigma+-}$ etc., with $p_{1++} = \frac12 p_1$, $p_{1+-} = \frac12 p_1$, $p_{1-+} = p_{1--} = 0$ and similar for the other $\sigma$. On the level of the substates the observables $A^{(1)}$ and $A^{(2)}$ are classical observables, with $p_\sigma \bar A^{(1)}_\sigma = p_{\sigma++} + p_{\sigma+-} - p_{\sigma-+} - p_{\sigma--}$, $p_\sigma \bar A^{(2)}_\sigma = p_{\sigma++} - p_{\sigma+-} + p_{\sigma-+} - p_{\sigma--}$. (Since for half of the substates the probabilities vanish we could actually use a minimal set of eight substates.) When we later discuss the time evolution of the ensemble, we will keep the relative probabilities for the substates, i.e. the ratios $p_{1++}/p_1$, $p_{2+-}/p_2$ etc. fixed, according to the fixed entries in Table I. This defines the notion of fixed probabilistic observables. Again, this discussion demonstrates how probabilistic observables can be implemented in standard classical statistics with classical observables. We emphasize, however, that such an implementation is not necessary and we will consider the probabilistic observables as genuine statistical objects.

For probabilistic observables we encounter features well known from quantum mechanics. There are states where a given observable cannot have a sharp value. For example, the spin $A^{(2)}$ has a mean value zero in the states $(0)$ and $(\pi)$. For $p_1 = p_{(0)} = 1$, $p_2 = p_3 = p_4 = 0$, we find a maximum variance for $A^{(2)}$, namely $\langle (A^{(2)})^2\rangle - \langle A^{(2)}\rangle^2 = 1$. On the other hand, for the state $(\pi/2)$, i.e. $p_1 = p_2 = p_4 = 0$, $p_3 = p_{(\pi/2)} = 1$, the variance vanishes and $A^{(2)}$ has a sharp value. In analogy to quantum mechanics we will denote the states where some observable has zero variance, i.e. $\langle A\rangle^2 = 1$, as "classical eigenstates" for this observable. We call the value of the observable in such a classical eigenstate the "classical eigenvalue". For the spin $A^{(2)}$ we have two eigenstates, $(\pi/2)$ and $(-\pi/2)$, with respective eigenvalues $+1$ and $-1$. The setting of Table I is analogous to two orthogonal spins in the quantum mechanics of a spin 1/2 system. If one observable has a sharp value, the other has maximal uncertainty.

We want to push the analogy with quantum mechanics even further and describe rotations in the plane spanned by the two spins within our setting of classical statistics. At this stage we encounter a problem. Rotating the pure state $(0)$ by an angle $\pi/4$, we should arrive at expectation values $\langle A^{(1)}\rangle = \langle A^{(2)}\rangle = 1/\sqrt{2}$. This cannot be realized in our system of four micro-states. Indeed, the sum of the components should be $\langle A^{(1)}\rangle + \langle A^{(2)}\rangle = \sqrt{2}$, while for an arbitrary probability distribution $\{p_\sigma\}$ we find the inequality

$$\langle A^{(1)}\rangle + \langle A^{(2)}\rangle = p_1 - p_2 + p_3 - p_4 \le 1. \tag{23}$$

While the rotations of the state $(0)$ should lie on the circle in Fig. 2, the allowed macro-states of our four-state system are inside the square enclosed by the dashed lines. Only the four particular "pure states", where one of the $p_\sigma$ equals one, obey $\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2 = 1$. For $p_1 = p_3 = \frac12$, $p_2 = p_4 = 0$ one has $\langle A^{(1)}\rangle = \langle A^{(2)}\rangle = 1/2$ and therefore $\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2 = 1/2$.
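
The statements about the four-state system can be verified with a few lines of code. This is a sketch only: the mean values per micro-state are those implied by eq. (22), and the state labels and helper names are hypothetical.

```python
import math

# Mean values (Abar^(1), Abar^(2)) in the four micro-states, as implied by
# eq. (22); the string labels are the angles used in the text.
MEAN = {"0": (1, 0), "pi": (-1, 0), "pi/2": (0, 1), "-pi/2": (0, -1)}

def expectation(p, k):
    """<A^(k+1)> for k = 0, 1 in the macro-state p (dict: label -> probability)."""
    return sum(p[s] * MEAN[s][k] for s in MEAN)

def variance(p, k):
    """<A^2> - <A>^2, using <A^2> = 1 for a two-level observable."""
    return 1.0 - expectation(p, k) ** 2

def pure(state):
    """Pure state: probability one for a single micro-state."""
    return {s: (1.0 if s == state else 0.0) for s in MEAN}

# A linear function on the probability simplex is maximal at a vertex,
# so checking the four pure states bounds <A^(1)> + <A^(2)>.
max_sum = max(expectation(pure(s), 0) + expectation(pure(s), 1) for s in MEAN)
rotation_target = math.sqrt(2.0)  # value required for a rotation by pi/4
```

The maximal reachable sum is 1, below the value $\sqrt{2}$ that a rotation by $\pi/4$ would require.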

One may improve the situation by considering a classical system with eight microstates. For this purpose we add four more states denoted by $(\pi/4)$, $(-\pi/4)$, $(3\pi/4)$, $(-3\pi/4)$ - cf. the open circles in Fig. 2. The mean values of the spins $A^{(1)}$ and $A^{(2)}$ in each of these four additional states are shown in the second part of Table I. The average values in an arbitrary macro-state are given by the eight probabilities $p_\sigma$ according to

$$\langle A^{(1)}\rangle = p_{(0)} - p_{(\pi)} + \frac{1}{\sqrt{2}}\left(p_{(\frac{\pi}{4})} + p_{(-\frac{\pi}{4})} - p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right),$$
$$\langle A^{(2)}\rangle = p_{(\frac{\pi}{2})} - p_{(-\frac{\pi}{2})} + \frac{1}{\sqrt{2}}\left(p_{(\frac{\pi}{4})} - p_{(-\frac{\pi}{4})} + p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right). \tag{24}$$

A rotation by $\pi/4$ is now described by a change from the state $(0)$ to the state $(\pi/4)$. We define as "classical pure states" the ones which have one probability $p_{\bar\sigma}$ exactly equal to one and the others vanishing, $p_{\sigma \neq \bar\sigma} = 0$. We have now eight pure states, and for all of them one observes $\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2 = 1$. The rotation of a pure state switches to another pure state. It is not realized by "mixed states" for which $\sum_\sigma p_\sigma^2 < 1$. (Note that rotations for mixed states with $\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2 \le 1/2$ could, in principle, be realized by a suitable trajectory in the space of probability distributions $\{p_1, p_2, p_3, p_4\}$.)

For the system with eight states we can also define two further two-level observables, namely $A^{(\pi/4)}$ and $A^{(-\pi/4)}$. These "diagonal spins" are specified by their mean values in the eight micro-states, as given by the entries in Table II. At this stage they are not obviously related to linear combinations of $A^{(1)}$ and $A^{(2)}$ - in fact, linear combinations are a priori not defined for the two-level observables that can only take values $+1$ and $-1$ for any measurement. We will only later introduce a concept of linear combinations such that the diagonal spins correspond to a rotation of the spins $A^{(1)}$ and $A^{(2)}$, similar to quantum mechanics.

TABLE II: Mean values for "diagonal spins" in different micro-states.

By suitable mixed states we can realize in the eight-state-system all expectation values of $A^{(1)}$ and $A^{(2)}$ within the dotted octagon in Fig. 2. Rotations for mixed states could now be achieved by appropriate $\{p_\sigma\}$ if $\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2 \le (2 + \sqrt{2})/4$. For example, $p_{(0)} = \frac12$, $p_{(\pi/4)} = \frac12$ yields $\langle A^{(1)}\rangle = \frac12\left(1 + \frac{1}{\sqrt{2}}\right)$, $\langle A^{(2)}\rangle = \frac{1}{2\sqrt{2}}$. For pure states, however, we still cannot realize the continuous rotations on the circle of Fig. 2, but only a discrete subset of rotations in units of $\pi/4$, corresponding to the $Z_8$-subgroup of the $SO(2)$ rotation group.
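
The numbers quoted for the eight-state system can be reproduced directly. In the sketch below the mean values of the two spins in the micro-state with angle $\varphi$ are taken as $(\cos\varphi, \sin\varphi)$, which agrees with eq. (24); the function names are hypothetical.

```python
import math

# Eight micro-states labelled by the angles k*pi/4, k = 0..7
# (k = 5, 6, 7 correspond to -3pi/4, -pi/2, -pi/4).
ANGLES = [k * math.pi / 4 for k in range(8)]

def expectations(p):
    """(<A^(1)>, <A^(2)>) for a list p of eight probabilities, cf. eq. (24)."""
    a1 = sum(pk * math.cos(phi) for pk, phi in zip(p, ANGLES))
    a2 = sum(pk * math.sin(phi) for pk, phi in zip(p, ANGLES))
    return a1, a2

def pure(k):
    """Pure state concentrated on the micro-state with angle k*pi/4."""
    return [1.0 if j == k else 0.0 for j in range(8)]

def squared_length(p):
    """<A^(1)>^2 + <A^(2)>^2."""
    a1, a2 = expectations(p)
    return a1 * a1 + a2 * a2

# Equal mixture of the states (0) and (pi/4): midpoint of an octagon edge.
midpoint = [0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
inscribed_radius_squared = (2.0 + math.sqrt(2.0)) / 4.0
```

The midpoint of an octagon edge lies exactly on the inscribed circle, which is the largest circle on which mixed states can be rotated continuously.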

It is clear how to improve further by adding additional 
micro-states which interpolate closer and closer to the cir- 
cle. The rotation problem for two spins can be solved by 
considering infinitely many different micro-states, each cor- 
responding to a particular angle on the circle. With a finite 
number of N micro-states we can come arbitrarily close to 
the continuous rotations by realizing a $Z_N$-subgroup. The
full rotation group obtains in a well defined limit $N \to \infty$.

The need of infinitely many micro-states for describing 
the rotation of a pure state in classical statistics should 
come as no surprise. By definition a pure state in classical 
statistics has zero variance for all classical observables. It 
is realized for probability distributions where one $p_\sigma$ equals
one, while all others are zero. For classical pure states the
statistical character of the probability distribution $\{p_\sigma\}$ is
lost and each macro-state corresponds to one particular 
micro-state. The continuous rotation of a spin variable 
therefore requires a continuous family of micro-states. In 
other words, the continuous rotation of a planet is not de- 
scribed by different mixed states in a probability distribu- 
tion, but just by different values of the angle which denotes 
the (deterministic) classical states which are pure states in 
a statistical sense. (Of course, pure states are only a very 



good approximation to the real statistical character of the 
planet.) The only thing that we have done in this section is 
a generalization of this situation from classical observables 
to the probabilistic two level observables. Nevertheless, 
this has a far reaching consequence: infinitely many clas- 
sical states are needed for the description of a two-state 
quantum system. 

In summary of this section, a description of spin rotations requires a continuous manifold of microstates. In the simplest case this manifold is the circle $S^1$, parameterized by an angle $\varphi$. More complex manifolds may involve additional parameters or coordinates. In case of a manifold of microstates $S^1$, the probability of a particular state $\varphi$ or $(f_1, f_2)$ is given by $p_\sigma = p(\varphi) = p(f_k)$. Here the "cartesian coordinates" $f_1$, $f_2$ obey $f_1^2 + f_2^2 = 1$ and serve as a convenient alternative parameterization of the circle. For more complex manifolds $p(\varphi)$ corresponds to a marginalized probability, where one integrates over all other parameters or coordinates. "Spins" are described by a continuous family of probabilistic two-level observables $A(\psi) = A(e_k)$, $e_1^2 + e_2^2 = 1$. Here $\psi$ or $(e_1, e_2)$ indicates the direction of the spin. The mean value of $A(e_k)$ in the state $f_k$ is given by

$$\bar A_\sigma = \bar A_f(e_k) = \bar A_\varphi(\psi) = \sum_{k=1}^{2} e_k f_k = \cos(\psi - \varphi), \tag{25}$$

and the probability $w_+$ for finding for $A(e_k)$ the value $+1$ obeys

$$w_+ = \frac12\Big(1 + \sum_k e_k f_k\Big) = \frac12\big(1 + \cos(\psi - \varphi)\big). \tag{26}$$

The average value in the ensemble or macro-state reads

$$\langle A\rangle = \langle A(e_k)\rangle = \sum_{\{f_k\}} p(f_k)\, \bar A_f(e_k) = \int_0^{2\pi} d\varphi\, p(\varphi) \cos(\psi - \varphi). \tag{27}$$
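
A discretized version of the ensemble average over the circle of micro-states is sketched below. The grid size, the sharply peaked test distribution, and all names are assumptions made for illustration; the integral is replaced by a uniform Riemann sum.

```python
import math

N = 2000
PHIS = [2 * math.pi * j / N for j in range(N)]  # grid of micro-states on S^1

def normalized(density):
    """Discretize a non-negative density on the grid and normalize it to sum 1."""
    values = [density(phi) for phi in PHIS]
    total = sum(values)
    return [v / total for v in values]

def spin_expectation(p, psi):
    """Discretized eq. (27): sum over phi of p(phi) * cos(psi - phi)."""
    return sum(pj * math.cos(psi - phi) for pj, phi in zip(p, PHIS))

def prob_plus(phi, psi):
    """Eq. (26): probability to find +1 for the spin along psi in the state phi."""
    return 0.5 * (1.0 + math.cos(psi - phi))

uniform = normalized(lambda phi: 1.0)
phi0 = 0.7  # assumed center of a sharply peaked, nearly pure distribution
peaked = normalized(lambda phi: math.exp(200.0 * (math.cos(phi - phi0) - 1.0)))
```

For the uniform distribution every spin has vanishing expectation value, while the sharply peaked distribution approaches the pure-state result $\cos(\psi - \varphi_0)$.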

V. REDUCTION OF DEGREES OF FREEDOM 

In contrast to the infinitely many micro-states in classical 
statistics, it is impressive how quantum mechanics solves 
the rotation problem very economically: only two quantum 
states are needed, described by a two-component complex 
wave function. In fact, the classical solution to the rotation 
problem seems to be characterized by a huge redundancy. 
An infinite set of continuous probabilities $p_\sigma$ is employed to describe the expectation value of a spin observable, which can be characterized by only two continuous variables, the angle and the length $(\langle A^{(1)}\rangle^2 + \langle A^{(2)}\rangle^2)^{1/2}$. One may be tempted to reduce the number of degrees of freedom by "integrating out" some of the micro-states, and assigning new "effective probabilities" $\tilde p_\sigma$ to the remaining micro-states. Such a "coarse graining of states" can indeed be done - but the price is that the effective $\tilde p_\sigma$ can also take negative values, or the sum of them may become larger than one. The effective probabilities can therefore no longer be considered as true probabilities.

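We may demonstrate this by considering the system with eight classical states, $(0), (\pi), (\frac{\pi}{2}), (-\frac{\pi}{2}), (\frac{\pi}{4}), (-\frac{\pi}{4}), (\frac{3\pi}{4}), (-\frac{3\pi}{4})$, and "integrate out" the states $(\frac{\pi}{4}), (-\frac{\pi}{4}), (\frac{3\pi}{4}), (-\frac{3\pi}{4})$ in favor of new "effective probabilities" $\tilde p_{(0)}, \tilde p_{(\pi)}, \tilde p_{(\frac{\pi}{2})}$ and $\tilde p_{(-\frac{\pi}{2})}$ for the remaining four states. Let us first integrate out only the state $(\frac{\pi}{4})$. In order to keep the same expectation values for $A^{(1)}$ and $A^{(2)}$ according to eq. (24), we need

$$\tilde p_{(0)} - \tilde p_{(\pi)} = p_{(0)} - p_{(\pi)} + \frac{1}{\sqrt{2}}\, p_{(\frac{\pi}{4})}, \qquad \tilde p_{(\frac{\pi}{2})} - \tilde p_{(-\frac{\pi}{2})} = p_{(\frac{\pi}{2})} - p_{(-\frac{\pi}{2})} + \frac{1}{\sqrt{2}}\, p_{(\frac{\pi}{4})}, \tag{28}$$

with the general solution

$$\tilde p_{(0)} = p_{(0)} + \frac{\alpha}{\sqrt{2}}\, p_{(\frac{\pi}{4})}, \quad \tilde p_{(\pi)} = p_{(\pi)} + \frac{\alpha - 1}{\sqrt{2}}\, p_{(\frac{\pi}{4})}, \quad \tilde p_{(\frac{\pi}{2})} = p_{(\frac{\pi}{2})} + \frac{\beta}{\sqrt{2}}\, p_{(\frac{\pi}{4})}, \quad \tilde p_{(-\frac{\pi}{2})} = p_{(-\frac{\pi}{2})} + \frac{\beta - 1}{\sqrt{2}}\, p_{(\frac{\pi}{4})}. \tag{29}$$

For the sum this yields

$$\Sigma = \tilde p_{(0)} + \tilde p_{(\pi)} + \tilde p_{(\frac{\pi}{2})} + \tilde p_{(-\frac{\pi}{2})} = p_{(0)} + p_{(\pi)} + p_{(\frac{\pi}{2})} + p_{(-\frac{\pi}{2})} + \sqrt{2}\,(\alpha + \beta - 1)\, p_{(\frac{\pi}{4})}. \tag{30}$$

Consider now the pure microstate $p_{(\frac{\pi}{4})} = 1$, $p_{(0)} = p_{(\pi)} = p_{(\frac{\pi}{2})} = p_{(-\frac{\pi}{2})} = 0$. If we want $\tilde p_{(\pi)}$ and $\tilde p_{(-\frac{\pi}{2})}$ to be positive, we need $\alpha \ge 1$, $\beta \ge 1$. However, in this case we obtain $\Sigma \ge \sqrt{2}$. We can therefore not keep simultaneously $\Sigma = 1$ and $\tilde p_\sigma$ positive!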

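We may fix the coefficients $\alpha$ and $\beta$ as $\alpha = \frac12$, $\beta = \frac12$. Integrating out also the states $(-\frac{\pi}{4})$, $(\frac{3\pi}{4})$ and $(-\frac{3\pi}{4})$ we can express the effective probabilities $\tilde p_\sigma$ for the four-state system in terms of the probabilities $p_\sigma$ of the eight-state system as

$$\tilde p_{(0)} = p_{(0)} + \frac{1}{2\sqrt{2}}\left(p_{(\frac{\pi}{4})} + p_{(-\frac{\pi}{4})} - p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right),$$
$$\tilde p_{(\pi)} = p_{(\pi)} - \frac{1}{2\sqrt{2}}\left(p_{(\frac{\pi}{4})} + p_{(-\frac{\pi}{4})} - p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right),$$
$$\tilde p_{(\frac{\pi}{2})} = p_{(\frac{\pi}{2})} + \frac{1}{2\sqrt{2}}\left(p_{(\frac{\pi}{4})} - p_{(-\frac{\pi}{4})} + p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right),$$
$$\tilde p_{(-\frac{\pi}{2})} = p_{(-\frac{\pi}{2})} - \frac{1}{2\sqrt{2}}\left(p_{(\frac{\pi}{4})} - p_{(-\frac{\pi}{4})} + p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right). \tag{31}$$

It is easy to verify that also the observables $A^{(\pi/4)}$ and $A^{(-\pi/4)}$ keep the same expectation values after the reduction of degrees of freedom. We can now compute the expectation values of all four two-level observables by using $\langle A\rangle = \sum_\sigma p_\sigma \bar A_\sigma$ (cf. eq. (13)), with $\bar A_\sigma$ given by the left half of Tables I and II and $p_\sigma$ replaced by $\tilde p_\sigma$. The only memory that we have started from a system with eight microstates is the modified range for the effective probabilities $\tilde p_\sigma$. Since only the combinations $\tilde p_{(0)} - \tilde p_{(\pi)}$ and $\tilde p_{(\frac{\pi}{2})} - \tilde p_{(-\frac{\pi}{2})}$ appear in the expectation values, all these statements hold actually for an arbitrary choice of $\alpha$ and $\beta$ in eq. (29). The actual range for $\tilde p_\sigma$ depends on the choice of $\alpha$, $\beta$ - for the choice (31) we have $\tilde p_\sigma \ge -1/(2\sqrt{2})$ and $\sum_\sigma \tilde p_\sigma \le 1$.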
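
The appearance of negative effective probabilities can be made explicit. The sketch below assumes the symmetric choice $\alpha = \beta = 1/2$ for the coefficients, so that each diagonal state contributes with weight $1/(2\sqrt{2})$; the dictionary keys and function names are hypothetical.

```python
import math

S = 1.0 / (2.0 * math.sqrt(2.0))  # weight of a diagonal state for alpha = beta = 1/2

LABELS = ["0", "pi", "pi/2", "-pi/2", "pi/4", "-pi/4", "3pi/4", "-3pi/4"]

def reduce_to_four(p):
    """Map eight probabilities to four effective ones, keeping <A^(1)>, <A^(2)>."""
    d1 = p["pi/4"] + p["-pi/4"] - p["3pi/4"] - p["-3pi/4"]
    d2 = p["pi/4"] - p["-pi/4"] + p["3pi/4"] - p["-3pi/4"]
    return {
        "0": p["0"] + S * d1,
        "pi": p["pi"] - S * d1,
        "pi/2": p["pi/2"] + S * d2,
        "-pi/2": p["-pi/2"] - S * d2,
    }

def pure(label):
    """Pure state of the eight-state system."""
    return {s: (1.0 if s == label else 0.0) for s in LABELS}

effective = reduce_to_four(pure("pi/4"))
spin1 = effective["0"] - effective["pi"]        # <A^(1)> after the reduction
spin2 = effective["pi/2"] - effective["-pi/2"]  # <A^(2)> after the reduction
```

For the pure state $(\pi/4)$ two of the four effective probabilities come out negative, while the spin expectation values retain the value $1/\sqrt{2}$.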

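We may proceed one step further and also integrate out the states $(\pi)$ and $(-\frac{\pi}{2})$. We denote the resulting effective probabilities by $\rho_1$ for the state $(0)$ and $\rho_2$ for the state $(\frac{\pi}{2})$. They are given by

$$\rho_1 = \tilde p_{(0)} - \tilde p_{(\pi)} = p_{(0)} - p_{(\pi)} + \frac{1}{\sqrt{2}}\left(p_{(\frac{\pi}{4})} + p_{(-\frac{\pi}{4})} - p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right),$$
$$\rho_2 = \tilde p_{(\frac{\pi}{2})} - \tilde p_{(-\frac{\pi}{2})} = p_{(\frac{\pi}{2})} - p_{(-\frac{\pi}{2})} + \frac{1}{\sqrt{2}}\left(p_{(\frac{\pi}{4})} - p_{(-\frac{\pi}{4})} + p_{(\frac{3\pi}{4})} - p_{(-\frac{3\pi}{4})}\right). \tag{32}$$

In the reduced two-state-system the expectation values for the spins simply read

$$\langle A^{(1)}\rangle = \rho_1, \quad \langle A^{(2)}\rangle = \rho_2, \quad \langle A^{(\pi/4)}\rangle = \frac{1}{\sqrt{2}}(\rho_1 + \rho_2), \quad \langle A^{(-\pi/4)}\rangle = \frac{1}{\sqrt{2}}(\rho_1 - \rho_2). \tag{33}$$

The ambiguity associated to the choice of $\alpha$ and $\beta$ in the previous step has now disappeared and the $\rho_k$ are fixed uniquely in terms of the eight original $p_\sigma$. This uniqueness follows directly from eq. (33), since the $\rho_k$ are associated to expectation values which do not change in the course of the reduction of degrees of freedom. One finds for the range of the effective probabilities $\rho_k$ for the reduced two classical states

$$-1 \le \rho_k \le 1, \qquad \sum_k \rho_k^2 \le 1, \tag{34}$$

with $\sum_k \rho_k^2 = 1$ precisely for the eight original pure classical states. The two state system is the minimal system which can describe the two spins $A^{(1)}$, $A^{(2)}$.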

We could have started our reduction procedure with some other classical system with $2^M$ micro-states. Proceeding stepwise by reducing $M$ consecutively by one unit, one finally arrives again at the two state system, with effective probabilities $\rho_k$ obeying the constraints (34) and expectation values $\langle A^{(k)}\rangle = \rho_k$. Starting with infinitely many micro-states, $M \to \infty$, one finds that arbitrary values of $\rho_k$ obeying $\sum_k \rho_k^2 \le 1$ can be realized. This follows directly from the observation that the expectation values of $A^{(k)}$ can take arbitrary values within the unit circle, $\sum_k \langle A^{(k)}\rangle^2 \le 1$, and the relation $\langle A^{(k)}\rangle = \rho_k$ of eq. (33). Starting with a statistical system with finite $M$ leads to further restrictions on the allowed range of $\rho_k$. This range simply coincides with the allowed range for $\langle A^{(k)}\rangle$. For $M = 3$ it is given by the dotted octagon in Fig. 2.

We may also investigate systems with a third independent two-level observable $A^{(3)}$. The first step of our construction involves now six micro-states with mean values of the bi-modal observables shown in Table III.

TABLE III: Microscopic mean values for three "orthogonal" two-level observables

The solution of the three-dimensional rotation problem 
by infinitely many micro-states proceeds in complete anal- 
ogy to the discussion above, with an interpolation of the 
pure states towards the unit sphere $S^2$. Also the reduction
to a smaller number of states by "integrating out" some of
the micro-states is analogous to the two-dimensional case.
The minimal system has three effective states, with effective
probabilities $\rho_k$, $k = 1, 2, 3$, obeying again the restrictions
(34) and $\langle A^{(k)}\rangle = \rho_k$.

In summary, the manifold of microstates must contain the submanifold $S^2$, parameterized by two angles $(\vartheta, \varphi)$ or a three dimensional vector of unit length $(f_1, f_2, f_3)$, $\sum_k f_k^2 = 1$. In case of the minimal manifold $S^2$ the classical statistical systems are given by probability distributions $p(f_k)$ or $p(\vartheta, \varphi)$. Again, for extended manifolds of states we consider marginalization. We observe
that $S^2$ corresponds to the manifold of normalized pure
states for two-state quantum mechanics, i.e. the projective 
Hilbert space. Probability distributions on the projective 
Hilbert space have been discussed by [l3|, 0, [l^, @. 
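The reduction of redundant degrees of freedom maps the probability distributions on $S^2$ onto three real numbers, $p(f_k) \to \rho_k$, according to

$$\rho_k = \sum_{\{f_k\}} p(f_k)\, f_k = \int d\Omega\, p(\vartheta, \varphi)\, f_k(\vartheta, \varphi). \tag{35}$$

From $\sum_k f_k^2(\vartheta, \varphi) \le 1$ and $\int d\Omega\, p(\varphi, \vartheta) = 1$ one derives the inequality

$$\sum_k \rho_k^2 \le 1. \tag{36}$$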

For a system with infinitely many micro-states we can also define infinitely many two-level observables. A given spin may be denoted by a three-vector $\vec A$, which can take an arbitrary direction in the cartesian system defined by the orthogonal "directions" corresponding to $\rho_1, \rho_2, \rho_3$. We may characterize the direction of the spin $\vec A$ by the cartesian coordinates of a three dimensional unit vector $e_k$, $\sum_k e_k^2 = 1$. A more accurate notation for $\vec A$ would be $A(e_k)$, since the measurements of $A$ will not yield three real numbers, but only one with values $+1$ and $-1$. The mean value of $A(e_k)$ in the micro-state $f_k$ is given by

$$\bar A_f(e_k) = \sum_{k=1}^{3} e_k f_k, \tag{37}$$

and, using eqs. (35), (37), the expectation value of $A$ obeys the intuitive simple rule (cf. eq. (33))

$$\langle A\rangle = \langle A(e_k)\rangle = \sum_{k=1}^{3} e_k \rho_k. \tag{38}$$

This key ingredient of our formalism will be addressed more formally later. At this place we observe that the three numbers $\rho_k$ are given by the expectation values of three basis observables $A^{(k)}$,

$$\rho_k = \langle A^{(k)}\rangle. \tag{39}$$
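
The reduction from a probability distribution on the sphere to three numbers $\rho_k$ can be illustrated with a finite sample of micro-states. The ensemble below is random and purely illustrative, and the helper names are hypothetical; the point is only that the expectation value of any spin is a linear function of the $\rho_k$.

```python
import math
import random

random.seed(1)

def random_unit_vector():
    """Random point on the unit sphere S^2 (normalized Gaussian vector)."""
    while True:
        v = [random.gauss(0.0, 1.0) for _ in range(3)]
        n = math.sqrt(sum(c * c for c in v))
        if n > 0.0:
            return [c / n for c in v]

# Illustrative ensemble: finitely many micro-states f on S^2 with weights p(f).
states = [random_unit_vector() for _ in range(500)]
raw = [random.random() for _ in states]
norm = sum(raw)
probs = [r / norm for r in raw]

# Discrete version of eq. (35): rho_k = sum_f p(f) f_k
rho = [sum(p * f[k] for p, f in zip(probs, states)) for k in range(3)]

def expectation_from_micro_states(e):
    """<A(e)> = sum_f p(f) (e . f), using the mean value of eq. (37)."""
    return sum(
        p * sum(ek * fk for ek, fk in zip(e, f)) for p, f in zip(probs, states)
    )

def expectation_from_rho(e):
    """Eq. (38): <A(e)> = sum_k e_k rho_k."""
    return sum(ek * rk for ek, rk in zip(e, rho))
```

All 500 weights enter the first function, but only the three numbers `rho` enter the second, and both give the same expectation value for every direction `e`.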

We will see that these numbers specify the state of the 
quantum system completely. In other words, we may con- 
sider the quantum system as a subsystem of a larger clas- 
sical system with infinitely many degrees of freedom. Its 
state can be characterized by the expectation values of a 
number of basis observables - the three spin observables 
in our case. The expectation values $\rho_k$ cannot all take
arbitrary values between $-1$ and $+1$. In our case they are
restricted by the condition $\sum_k \rho_k^2 \le 1$.

VI. DENSITY MATRIX 

In the following we will concentrate on manifolds of micro-states for which the inequality (36) holds. This is automatic for the minimal manifold $S^2$, but may restrict the probability distribution $p_\sigma$ for larger manifolds of micro-states. If eq. (36) is obeyed, the expression (38) can be brought into a form familiar from quantum mechanics. We associate to each two-level observable $A(e_k)$ a $2 \times 2$ matrix

$$\hat A(e_k) = \sum_k e_k \tau_k, \tag{40}$$

with $\tau_k$ the three Pauli matrices obeying the anticommutation relation $\{\tau_k, \tau_l\} = 2\delta_{kl}$. Similarly, we may group the effective probabilities $\rho_k$ into a "density matrix" $\rho$,

$$\rho = \frac12\Big(1 + \sum_k \rho_k \tau_k\Big). \tag{41}$$

In terms of these matrices the expectation values obey the quantum mechanical rule

$$\langle A(e_k)\rangle = \mathrm{tr}\big(\hat A(e_k)\, \rho\big). \tag{42}$$

In a quantum mechanical language the "operator" $\hat A$ is precisely the (unit-)spin operator $\hat S$ in the direction of $\vec e$. For the infinite system we may therefore switch to the familiar spin-notation $A \to S$. In conclusion, we have established that the classical system with infinitely many degrees of freedom precisely obeys all quantum mechanical relations for the expectation values as given by

$$\langle S\rangle = \mathrm{tr}(\hat S \rho). \tag{43}$$

In particular, all relations following from the uncertainty principle are implemented directly. For a quantum mechanical two state system all information about the statistical state of the system is encoded in the density matrix, which obeys the usual relations $\mathrm{tr}\,\rho = 1$, $\rho_{11} \ge 0$, $\rho_{22} \ge 0$, $\mathrm{tr}\,\rho^2 \le 1$, and $\mathrm{tr}\,\rho^2 = 1$ for pure states. For our classical statistical system these relations follow from the definition (41) and the range for $\rho_k$ in eq. (34). The spin operators $\hat S$ are the most general hermitean operators for the quantum mechanical two-state system. On the level of expectation values of operators we have constructed a one to one mapping between the quantum mechanical two-state system and a classical system with infinitely many micro-states.
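
The trace rule can be checked with explicit $2 \times 2$ matrices. The following sketch uses plain nested lists instead of a linear-algebra library; the numerical values of $\rho_k$ and $e_k$ are arbitrary test inputs and the function names are hypothetical.

```python
# Pauli matrices as nested lists (tau_2 contains complex entries).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]

def combine(coeffs):
    """Return sum_k coeffs[k] * tau_k as a 2x2 matrix."""
    return [
        [sum(c * t[i][j] for c, t in zip(coeffs, TAU)) for j in range(2)]
        for i in range(2)
    ]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def trace(m):
    return m[0][0] + m[1][1]

def density_matrix(rho_k):
    """rho = (1 + sum_k rho_k tau_k) / 2."""
    s = combine(rho_k)
    return [[(IDENTITY[i][j] + s[i][j]) / 2 for j in range(2)] for i in range(2)]

def expectation(e, rho_k):
    """tr(A_hat(e) rho) with A_hat(e) = sum_k e_k tau_k; the result is real."""
    return trace(matmul(combine(e), density_matrix(rho_k))).real

rho_k = [0.3, -0.4, 0.5]   # arbitrary test values with sum of squares <= 1
e = [0.6, 0.0, 0.8]        # unit vector giving the spin direction
```

With these inputs the trace reproduces $\sum_k e_k \rho_k$, and $\mathrm{tr}\,\rho^2 = \frac12(1 + \sum_k \rho_k^2)$ stays below one for a mixed state.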



VII. CONDITIONAL CORRELATION 
FUNCTIONS 



There are several ways to define correlation functions in 
a statistical system. The basic issue is the definition of 
a product between two observables A and B. The prod- 
uct AB should again be an observable. The (two-point-) 
correlation function is then the expectation value of this 
product, $\langle AB\rangle$. The choice of the product is, however,
not unique. We have already demonstrated earlier how 
a quantum correlation function can arise from a classical 
statistical system Q. 

1. Classical product

For classical or deterministic observables the first candidate for a product between two observables $A$, $B$ is the pointwise or classical product

$$(A \cdot B)_\tau = A_\tau B_\tau. \tag{44}$$

In our approach, this definition is only possible on the level of substates $\tau$. For probabilistic observables on the level of micro-states, however, this type of product exists only for special cases, but not in general. We have argued in sect. II that the product of two probabilistic observables can be defined in a straightforward way only if they are "comeasurable", i.e. if the joint probabilities $w_{ab}$ for finding the value $\gamma_a$ for $A$ and $\gamma_b$ for $B$ are available, cf. eq. (17). In this case the product $AB$ in eq. (17) is equivalent to the classical product (44) on the substate level. In a quantum
mechanical language, this type of structure can only be re- 
alized if the associated operators A and B commute UM , 

In general, the property of comeasurability of two observables is lost when the substates are projected on the micro-states. Many different deterministic observables on the substate level, as characterized by their values $A_\tau$ in the substates $\tau$, are mapped into one and the same probabilistic observable. (The latter is characterized by $\bar A_\sigma$ for a two-level observable, and more generally by the spectrum $\gamma_a$ and the probabilities $w_a(\sigma)$.) Let us denote by $A_\tau$ and $A'_\tau$ two different classical observables on the substate level that are mapped into the same probabilistic observable $A$. While $\langle A\rangle = \langle A'\rangle$, the classical product with a different observable $B_\tau$ (which corresponds to a different probabilistic observable $B$) is, in general, different for $A$ and $A'$,

$$\langle A \cdot B\rangle \neq \langle A' \cdot B\rangle. \tag{45}$$



In consequence, there cannot be a unique classical product 
between the probabilistic observables $A$ and $B$. We conclude
that the classical product $\langle A \cdot B\rangle$ is a property of the
system and its environment, since it involves information 
that is only available on the substate level. For measure- 
ments in the (sub-) system this information is not acces- 
sible. Predictions for measurements of the system have to 
be formulated on the level of micro-states involving an ap- 
propriate product for the probabilistic observables A and 
B which has to be determined. 
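
A toy example makes the non-uniqueness of the classical product concrete. The four substates, their probabilities, and the three observables below are invented for illustration: `A` and `A_prime` have the same distribution of values within the micro-state, hence define the same probabilistic observable, yet their classical correlations with `B` differ.

```python
# Four equally probable substates tau inside a single micro-state (illustrative).
p_tau = [0.25, 0.25, 0.25, 0.25]

# Values of three classical two-level observables in the four substates.
A = [+1, -1, +1, -1]
A_prime = [+1, +1, -1, -1]
B = [+1, +1, -1, -1]

def average(values):
    """Ensemble average of a classical observable."""
    return sum(p * v for p, v in zip(p_tau, values))

def classical_correlation(X, Y):
    """<X . Y> with the pointwise product (X . Y)_tau = X_tau Y_tau, eq. (44)."""
    return sum(p * x * y for p, x, y in zip(p_tau, X, Y))
```

Here `average(A)` and `average(A_prime)` both vanish, whereas `classical_correlation(A, B)` is 0 and `classical_correlation(A_prime, B)` is 1.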

2. Pointwise product for probabilistic observables 

One possibility for a product of probabilistic observables is the "probabilistic pointwise product" that we denote with a cross, $A \times B$. It is defined by the multiplication of the mean values of $A$ and $B$ in every micro-state

$$(A \times B)_\sigma = \bar A_\sigma \bar B_\sigma. \tag{46}$$

In other words, one multiplies the probability to find a value $x_A$ for $A$ with the probability to find $x_B$ for $B$,

$$(A \times B)_\sigma = \int dx_A\, dx_B\, x_A x_B\, w^{(A)}_\sigma(x_A)\, w^{(B)}_\sigma(x_B). \tag{47}$$

Thus $A \times B$ is again a probabilistic observable, with

$$w_\sigma(x) = \int dx_A\, dx_B\, \delta(x - x_A x_B)\, w^{(A)}_\sigma(x_A)\, w^{(B)}_\sigma(x_B). \tag{48}$$

Using the discrete formulation (10) for the two-level observables one has

$$(A \times B)_\sigma = w^{AB}_{\sigma+} - w^{AB}_{\sigma-}, \tag{49}$$

with $w^{AB}_{\sigma+}$ the combined probability to find for the micro-state $\sigma$ a value $+1$ for $A$ and $+1$ for $B$, or $-1$ for $A$ and $-1$ for $B$, such that the sign of the product of values of $A$ and $B$ is positive. Similarly, $w^{AB}_{\sigma-}$ obtains from the situations where the respective signs are opposite

$$w^{AB}_{\sigma+} = w^{A}_{\sigma+} w^{B}_{\sigma+} + w^{A}_{\sigma-} w^{B}_{\sigma-}, \qquad w^{AB}_{\sigma-} = w^{A}_{\sigma+} w^{B}_{\sigma-} + w^{A}_{\sigma-} w^{B}_{\sigma+}. \tag{50}$$

The probabilistic pointwise product of two two-level observables is again a two-level observable.

The probabilistic pointwise product is commutative, and 
the corresponding pointwise correlation function equals the 
classical correlation function if A and B are classical ob- 
servables. However, the pointwise product is not the prod- 
uct that leads to our definition of A^ for the two-level ob- 
servables, where (A^) — 1 independently of (A). For the 
pointwise product one finds instead 



{A 



Ai<l. 



(51) 



{A-B)^ {A' ■ B) 



(45) 



The saturation of the bound obtains only for the two pure 
classical states which correspond to the particular micro- 
states a for which A = ±.1. This clearly indicates that the 
probabilistic pointwise product A x B cannot be used for 
the correlation between two measurements. 
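The failure of the pointwise product to give $A^2 = 1$ can be made concrete with a few lines of Python. The following is only an illustrative sketch: the function names are hypothetical, and the probabilities are taken as $w_\pm = (1 \pm A_\sigma)/2$ for a two-level observable with mean value $A_\sigma$.

```python
def two_level_probs(mean):
    """Return (w_plus, w_minus) for a two-level observable with mean value `mean`."""
    return 0.5 * (1.0 + mean), 0.5 * (1.0 - mean)


def pointwise_product_probs(mean_a, mean_b):
    """Eq. (50): probabilities for the product +1 / -1 if A and B are independent."""
    wa_p, wa_m = two_level_probs(mean_a)
    wb_p, wb_m = two_level_probs(mean_b)
    w_plus = wa_p * wb_p + wa_m * wb_m    # equal signs
    w_minus = wa_p * wb_m + wa_m * wb_p   # opposite signs
    return w_plus, w_minus


def pointwise_product(mean_a, mean_b):
    """Eq. (49): (A x B)_sigma = w_plus - w_minus, which equals mean_a * mean_b."""
    w_plus, w_minus = pointwise_product_probs(mean_a, mean_b)
    return w_plus - w_minus


# (A x A)_sigma = A_sigma^2 reaches 1 only for A_sigma = +1 or -1, eq. (51).
for a_sigma in (1.0, 0.5, 0.0, -1.0):
    print(a_sigma, pointwise_product(a_sigma, a_sigma))
```

For any micro-state with $|A_\sigma| < 1$ the printed value stays below one, which is the discrepancy with $\langle A^2\rangle = 1$ noted in the text.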
3. Conditional product

The reason for this discrepancy is the implicit assumption that in eq. (50) the probabilities $w^A_\pm$ and $w^B_\pm$ are independent of each other. This does not reflect a situation where the product $AB$ describes consecutive measurements of first $B$ and subsequently $A$. Once the first measurement of $A$ has found a value $+1$, a subsequent measurement of the same variable should find a value $+1$ with probability one. Then $(A^2)_\sigma = 1$ follows independently of the value $A_\sigma$ if the only allowed values are $\pm 1$, as for our spin variables. We therefore define a different product $A\circ B$ which involves conditional probabilities. Eq. (50) is replaced by

$$w^{AB}_{+,\sigma} = w^{A\circ B}_{+,\sigma} = (w^A_+)^B_+\, w^B_{+,\sigma} + (w^A_-)^B_-\, w^B_{-,\sigma},$$
$$w^{AB}_{-,\sigma} = w^{A\circ B}_{-,\sigma} = (w^A_+)^B_-\, w^B_{-,\sigma} + (w^A_-)^B_+\, w^B_{+,\sigma}. \tag{52}$$

Here $(w^A_+)^B_+$ denotes the conditional probability to find a value $+1$ for the measurement of $A$ under the condition that a previous measurement of $B$ has yielded a value $+1$. With $(w^A_+)^A_+ = 1$, $(w^A_-)^A_- = 1$, $(w^A_+)^A_- = 0$, $(w^A_-)^A_+ = 0$ one now has $w^{AA}_{+,\sigma} = w^A_{+,\sigma} + w^A_{-,\sigma} = 1$, $w^{AA}_{-,\sigma} = 0$, such that $(A\circ A)_\sigma = 1$. More generally, we define the "conditional product" $A\circ B$ according to

$$(A\circ B)_\sigma = w^{AB}_{+,\sigma} - w^{AB}_{-,\sigma} \tag{53}$$

with the new definition (52). This product underlies our definition of $A^2$, $\langle A^2\rangle = \langle A\circ A\rangle = 1$. The conditional product of two two-level observables is again a two-level observable, with probabilities given by eq. (52).

The definition of the conditional product $A\circ B$ requires a specification of the conditional probabilities $(w^A_+)^B_+$ etc. After the measurement of $B$ we know for sure that $B$ has the measured value, say $+1$. The probability of finding again $+1$ in a repetition of the measurement must be one. This is the property of a classical eigenstate of $B$, that we denote by $(+_B)$. We therefore take for the conditional probabilities

$$(w^A_\pm)^B_+ = \tfrac{1}{2}\left(1 \pm \langle A\rangle_{+B}\right),$$
$$(w^A_\pm)^B_- = \tfrac{1}{2}\left(1 \pm \langle A\rangle_{-B}\right), \tag{54}$$

with $\langle A\rangle_{\pm B}$ the expectation value of $A$ in the pure classical states $(\pm_B)$, and $(-_B)$ the eigenstate of $B$ with eigenvalue $-1$. For our particular system corresponding to two-state quantum mechanics the conditional probabilities are actually independent of the micro-state $\sigma$. They depend only on properties of the states $(+_B)$ or $(-_B)$. We obtain

$$(A\circ B)_\sigma = \langle A\rangle_{+B}\, w^B_{+,\sigma} - \langle A\rangle_{-B}\, w^B_{-,\sigma}
= \tfrac{1}{2}(1 + B_\sigma)\langle A\rangle_{+B} - \tfrac{1}{2}(1 - B_\sigma)\langle A\rangle_{-B}. \tag{55}$$

At this stage the conditional product is not obviously commutative.

However, we note that whenever

$$\langle A\rangle_{-B} = -\langle A\rangle_{+B} \tag{56}$$

holds, as in our case, one simply finds from eq. (55)

$$(A\circ B)_\sigma = \langle A\rangle_{+B} = \mathrm{tr}(\hat A \rho_{B+}), \qquad
(B\circ A)_\sigma = \langle B\rangle_{+A} = \mathrm{tr}(\hat B \rho_{A+}), \tag{57}$$

independently of the micro-state $\sigma$. For our two state system the pure state density matrices are easily found,

$$\rho_{B\pm} = \tfrac{1}{2}(1 \pm \hat B), \qquad \rho_{A\pm} = \tfrac{1}{2}(1 \pm \hat A), \tag{58}$$

such that

$$\langle A\rangle_{+B} = \langle B\rangle_{+A} = \tfrac{1}{2}\,\mathrm{tr}(\hat A\hat B). \tag{59}$$

This shows the commutativity of the conditional product of two probabilistic two-level observables

$$(A\circ B)_\sigma = (B\circ A)_\sigma. \tag{60}$$

It is instructive to compute the conditional product of two basis observables

$$A^{(k)}\circ A^{(l)} = \begin{cases} 1 & \text{for } k = l \\ R & \text{for } k \neq l \end{cases} \tag{61}$$

Here $R$ is the "random two-level observable" which obeys for every micro-state $\sigma$

$$R_\sigma = 0, \qquad w^R_{\pm,\sigma} = \tfrac{1}{2}, \qquad (R^2)_\sigma = 1, \tag{62}$$

and therefore

$$\langle R\rangle = 0, \qquad \langle R^2\rangle = 1. \tag{63}$$

We stress that $R$ is different from the "zero-observable", which takes the fixed value zero in every micro-state. For an arbitrary two-level observable $A$ one finds the relation

$$R\circ A = R. \tag{64}$$

The random observable is a special two-level observable since no micro-state can be an eigenstate of $R$, the eigenvalues being $\pm 1$. The product $A\circ R$ is therefore not defined since it would require a projection on eigenstates of $R$ after a first "measurement of $R$".
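The relations (56)-(61) can be checked numerically. The sketch below is an illustration only: it assumes that the operators associated with the two-level observables are $n_k\tau_k$ with Pauli matrices $\tau_k$ and a unit vector $n$, and all helper names are hypothetical.

```python
# Pauli matrices tau_1, tau_2, tau_3 and the unit matrix as nested tuples.
I2 = ((1, 0), (0, 1))
TAU = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(x):
    return x[0][0] + x[1][1]


def spin(n):
    """Operator n_k tau_k for a unit vector n = (n_1, n_2, n_3)."""
    return tuple(
        tuple(sum(n[k] * TAU[k][i][j] for k in range(3)) for j in range(2))
        for i in range(2)
    )


def eigen_density(op, sign):
    """Eq. (58): pure-state density matrix (1 + sign * op) / 2."""
    return tuple(
        tuple(0.5 * (I2[i][j] + sign * op[i][j]) for j in range(2))
        for i in range(2)
    )


def mean_in_eigenstate(a_op, b_op, sign=1):
    """<A>_{+B} (sign=+1) or <A>_{-B} (sign=-1), computed as tr(A rho_{B,sign})."""
    return complex(trace(matmul(a_op, eigen_density(b_op, sign)))).real


A = spin((0.0, 0.0, 1.0))
B = spin((0.6, 0.0, 0.8))
ab_plus = mean_in_eigenstate(A, B, +1)                 # <A>_{+B}
ab_minus = mean_in_eigenstate(A, B, -1)                # <A>_{-B}, eq. (56)
ba_plus = mean_in_eigenstate(B, A, +1)                 # <B>_{+A}, eq. (59)
half_trace = complex(0.5 * trace(matmul(A, B))).real   # tr(AB)/2, eq. (59)
# Two orthogonal basis observables: vanishing mean, i.e. the random observable R.
basis_12 = mean_in_eigenstate(spin((1.0, 0.0, 0.0)), spin((0.0, 1.0, 0.0)))
print(ab_plus, ab_minus, ba_plus, half_trace, basis_12)
```

In this example $\langle A\rangle_{+B}$, $\langle B\rangle_{+A}$ and $\tfrac12\mathrm{tr}(\hat A\hat B)$ coincide, $\langle A\rangle_{-B}$ has the opposite sign, and the mixed basis product has vanishing mean, as expected for $R$.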

4. Conditional correlation 

Let us next compute the probability that a sequence of a first measurement of $B$ and a subsequent one of $A$ yields the results $(+,+)$,

$$W^{AB}_{++} = (w^A_+)^B_+\, w^B_{+,s} = \tfrac{1}{4}\left(1 + \langle A\rangle_{+B}\right)\left(1 + \langle B\rangle\right)
= \tfrac{1}{4}\left(1 + \tfrac{1}{4}\mathrm{tr}\{\hat A,\hat B\}\right)\left(1 + \mathrm{tr}(\hat B\rho)\right), \tag{65}$$

and compare it with the sequence in the opposite order

$$W^{BA}_{++} = (w^B_+)^A_+\, w^A_{+,s} = \tfrac{1}{4}\left(1 + \tfrac{1}{4}\mathrm{tr}\{\hat A,\hat B\}\right)\left(1 + \mathrm{tr}(\hat A\rho)\right). \tag{66}$$

The probabilities for the sequences in different order are not equal. If we realize the probabilistic observables $A$, $B$ as classical observables on a substate level (with sharp values $A_r, B_r = \pm 1$ in any given substate $r$), we could also compute a "classical probability" $W^{\rm cl}_{++}$ that both $A$ and $B$ "have" the value $+1$. This would be given by the sum of the probabilities for all states $r$ for which $A_r = B_r = 1$. The order does not matter for the classical probability. Clearly, the probabilities $W^{AB}_{++}$ and $W^{BA}_{++}$ in eqs. (65), (66) differ from $W^{\rm cl}_{++}$. This demonstrates that our definition of the conditional probabilities $(w^A_+)^B_+$ etc. is not equivalent to the "classical conditional probability", which can be obtained from $W^{\rm cl}_{++}$ and appropriate marginalizations. While $W^{AB}_{++} \neq W^{BA}_{++}$ and $W^{AB}_{--} \neq W^{BA}_{--}$, the probability of finding for the product of the two measurements the value $+1$, namely

$$w^{AB}_{+,s} = W^{AB}_{++} + W^{AB}_{--} = \tfrac{1}{2}\left(1 + \tfrac{1}{4}\mathrm{tr}\{\hat A,\hat B\}\right), \tag{67}$$

no longer depends on the order,

$$w^{AB}_{+,s} = w^{BA}_{+,s}. \tag{68}$$

This "loss of memory of the order" for the sum (67) is the basis for the commutativity of $A\circ B$.

The "conditional correlation" is defined as

$$\langle A\circ B\rangle = \sum_\sigma p_\sigma (A\circ B)_\sigma = \langle A\rangle_{+B}\, w^B_{+,s} - \langle A\rangle_{-B}\, w^B_{-,s}
= \tfrac{1}{2}\left(1 + \langle B\rangle\right)\langle A\rangle_{+B} - \tfrac{1}{2}\left(1 - \langle B\rangle\right)\langle A\rangle_{-B}, \tag{69}$$

with $w^B_{\pm,s} = \sum_\sigma p_\sigma w^B_{\pm,\sigma}$ and $\langle B\rangle = \sum_\sigma p_\sigma B_\sigma = \sum_\sigma p_\sigma (w^B_{+,\sigma} - w^B_{-,\sigma}) = w^B_{+,s} - w^B_{-,s}$. For our orthogonal spin observables $A^{(k)}$ it has the simple property

$$\langle A^{(k)}\circ A^{(l)}\rangle = \delta^{kl} \tag{70}$$

since $\langle A^{(k)}\rangle_{\pm A^{(l)}} = 0$ for $k \neq l$. The conditional correlation reflects the properties of two consecutive measurements. It may therefore be more appropriate for a description of real measurements than the probabilistic pointwise correlation.

A priori, the order of the measurements may matter, i.e. $\langle B\circ A\rangle$ may differ from $\langle A\circ B\rangle$, but based on eq. (60) we conclude that the (two-point) correlation is actually commutative

$$\langle A\circ B\rangle = \langle B\circ A\rangle. \tag{71}$$

We will use in the next section the mapping to quantum mechanics and give a general expression of $\langle A\circ B\rangle$ in terms of the anticommutator of the associated operators $\hat A$, $\hat B$,

$$\langle A\circ B\rangle = \tfrac{1}{2}\,\mathrm{tr}\left(\{\hat A,\hat B\}\rho\right). \tag{72}$$

From eq. (72) the commutativity of the conditional correlation is apparent.
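The difference between the ordered sequence probabilities, and the loss of memory of the order in their sum, can be illustrated with a short sketch. It assumes a Bloch-vector parametrization that is not spelled out at this point of the text: $\langle B\rangle = b_k\rho_k$ for a unit vector $b$ and $\langle A\rangle_{\pm B} = \pm a_k b_k$. The function names are hypothetical.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def sequence_prob(second_dir, first_dir, rho_vec, first, second):
    """Probability that the first measurement yields `first` and the next yields `second`.

    Uses w_{first,s} = (1 + first * <B>) / 2 for the first measurement and the
    conditional probability of eq. (54) for the subsequent one.
    """
    w_first = 0.5 * (1.0 + first * dot(first_dir, rho_vec))
    w_cond = 0.5 * (1.0 + second * first * dot(second_dir, first_dir))
    return w_cond * w_first


a = (0.0, 0.0, 1.0)
b = (0.6, 0.0, 0.8)
rho_vec = (0.0, 0.0, 0.5)      # mixed state, <A> = 0.5, <B> = 0.4

W_AB_pp = sequence_prob(a, b, rho_vec, +1, +1)   # first B, then A, eq. (65)
W_BA_pp = sequence_prob(b, a, rho_vec, +1, +1)   # first A, then B, eq. (66)
w_AB_plus = W_AB_pp + sequence_prob(a, b, rho_vec, -1, -1)   # eq. (67)
w_BA_plus = W_BA_pp + sequence_prob(b, a, rho_vec, -1, -1)
print(W_AB_pp, W_BA_pp, w_AB_plus, w_BA_plus)
```

In this example the two ordered probabilities differ, while both sums equal $\tfrac12(1 + a_k b_k)$, in line with eq. (68).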



5. Conditional three point function 

The commutativity of the conditional two point correlation does not extend to the conditional three point correlation. We first define the conditional product of three two-level observables as

$$(A\circ B\circ C)_\sigma = (w^A_+)^B_+ (w^B_+)^C_+\, w^C_{+,\sigma} - (w^A_+)^B_+ (w^B_+)^C_-\, w^C_{-,\sigma}$$
$$\quad - (w^A_+)^B_- (w^B_-)^C_+\, w^C_{+,\sigma} + (w^A_+)^B_- (w^B_-)^C_-\, w^C_{-,\sigma}$$
$$\quad - (w^A_-)^B_+ (w^B_+)^C_+\, w^C_{+,\sigma} + (w^A_-)^B_+ (w^B_+)^C_-\, w^C_{-,\sigma}$$
$$\quad + (w^A_-)^B_- (w^B_-)^C_+\, w^C_{+,\sigma} - (w^A_-)^B_- (w^B_-)^C_-\, w^C_{-,\sigma}. \tag{73}$$

It is constructed in analogy to the conditional two point function and involves in an intuitive way the probabilities of finding for the measurements of $A, B, C$ the sequences $(+,+,+)$, $(+,+,-)$, $(+,-,+)$, ..., $(-,-,-)$, weighted with the appropriate product of the measured values. After a measurement of $\pm 1$ for $C$ the observable $B$ is measured in the $(\pm_C)$ eigenstate of $C$, and after a second measurement of $\pm 1$ for $B$ the observable $A$ is measured in the $(\pm_B)$ eigenstate of $B$. For the orthogonal two-level observables eq. (73) yields

$$A^{(k)}\circ A^{(l)}\circ A^{(m)} = \delta^{kl} A^{(m)} + (1 - \delta^{kl}) R. \tag{74}$$

The conditional three point function

$$\langle A\circ B\circ C\rangle = \sum_\sigma p_\sigma (A\circ B\circ C)_\sigma \tag{75}$$

obtains from eq. (73) by the replacement $w^C_{\pm,\sigma} \to w^C_{\pm,s}$. Similarly to eq. (69), the conditional three point function can be expressed as a product of expectation values

$$\langle A\circ B\circ C\rangle = \tfrac{1}{4}\Big\{ \langle A\rangle_{+B}\big[(1 + \langle B\rangle_{+C})(1 + \langle C\rangle) - (1 + \langle B\rangle_{-C})(1 - \langle C\rangle)\big]$$
$$\quad + \langle A\rangle_{-B}\big[(1 - \langle B\rangle_{-C})(1 - \langle C\rangle) - (1 - \langle B\rangle_{+C})(1 + \langle C\rangle)\big] \Big\}. \tag{76}$$

The quantum mechanical computation in the next section shows that the conditional three point correlation can be expressed as

$$\langle A\circ B\circ C\rangle = \tfrac{1}{4}\,\mathrm{tr}\left(\{\{\hat A,\hat B\},\hat C\}\rho\right). \tag{77}$$

It is therefore invariant under the exchange of $A$ and $B$, but not with respect to a change of the positions of $B$ and $C$ or $A$ and $C$. For the orthogonal spin observables one finds from eq. (77)

$$\langle A^{(k)}\circ A^{(l)}\circ A^{(m)}\rangle = \delta^{kl}\langle A^{(m)}\rangle \tag{78}$$

in accordance with eq. (74). We recall that all expectation values in eq. (76) are well defined in our setting with infinitely many micro-states, such that the computation of $\langle A\circ B\circ C\rangle$ can be done entirely within classical statistics. The non-commutativity is a consequence of the definition of the conditional product, which is adapted to a sequence of measurements in a given order.

The conditional product is not associative. This can be shown most easily by using the products of two and three basis observables (61), (74), which yield

$$(A^{(k)}\circ A^{(l)})\circ A^{(m)} = \begin{cases} A^{(m)} & \text{for } k = l \\ R & \text{for } k \neq l \end{cases} \tag{79}$$

On the other hand, $A^{(k)}\circ(A^{(l)}\circ A^{(m)})$ is not defined for $l \neq m$ since $A^{(l)}\circ A^{(m)} = R$ has no eigenstates, while for $l = m$ the result $A^{(k)}$ differs from eq. (79). The lack of commutativity of the product $A\circ B\circ C$ arises from the lack of associativity. Eq. (79) can be generalized to arbitrary two-level observables

$$A\circ B\circ C = (A\circ B)\circ C. \tag{80}$$

The observable $A\circ B$ can be evaluated in eigenstates $\rho_{C+}$ and $\rho_{C-}$ of $C$ such that $(A\circ B)\circ C$ is well defined. The probability of finding $(A\circ B)\circ C = +1$ is given by

$$w^{(A\circ B)\circ C}_{+,\sigma} = (w^{A\circ B}_+)^C_+\, w^C_{+,\sigma} + (w^{A\circ B}_-)^C_-\, w^C_{-,\sigma} \tag{81}$$

and similar for $w^{(A\circ B)\circ C}_{-,\sigma}$. With

$$(w^{A\circ B}_+)^C_+ = (w^A_+)^B_+ (w^B_+)^C_+ + (w^A_-)^B_- (w^B_-)^C_+,$$
$$(w^{A\circ B}_-)^C_- = (w^A_+)^B_- (w^B_-)^C_- + (w^A_-)^B_+ (w^B_+)^C_- \tag{82}$$

we indeed find for $w^{(A\circ B)\circ C}_{+,\sigma}$ all terms in eq. (73) with a plus sign. Again, $A\circ(B\circ C)$ is in general not defined.
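The sum over the eight sequences in eq. (73) can be evaluated explicitly. The sketch below again assumes the Bloch-vector parametrization $\langle C\rangle = c_k\rho_k$, $\langle B\rangle_{\pm C} = \pm b_k c_k$, $\langle A\rangle_{\pm B} = \pm a_k b_k$, which is an assumption of this illustration; the function names are hypothetical.

```python
from itertools import product


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def three_point(a, b, c, rho_vec):
    """State-averaged eq. (73): first C is measured, then B, then A."""
    total = 0.0
    for m_a, m_b, m_c in product((+1, -1), repeat=3):
        w_c = 0.5 * (1.0 + m_c * dot(c, rho_vec))      # w^C_{m_c, s}
        w_b = 0.5 * (1.0 + m_b * m_c * dot(b, c))      # (w^B_{m_b})^C_{m_c}
        w_a = 0.5 * (1.0 + m_a * m_b * dot(a, b))      # (w^A_{m_a})^B_{m_b}
        total += m_a * m_b * m_c * w_a * w_b * w_c     # weight with the measured values
    return total


a = (0.0, 0.0, 1.0)
b = (0.6, 0.0, 0.8)
c = (1.0, 0.0, 0.0)
rho_vec = (0.5, 0.0, 0.3)

abc = three_point(a, b, c, rho_vec)
bac = three_point(b, a, c, rho_vec)   # exchange of A and B
acb = three_point(a, c, b, rho_vec)   # exchange of B and C
print(abc, bac, acb)
```

In this parametrization the sum reduces to $(a_k b_k)(c_l\rho_l)$, so the result is unchanged under the exchange of $A$ and $B$ but changes when $B$ and $C$ are exchanged.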

6. Measurements as operations 

For our two-state quantum system the density matrix for an eigenstate of the observable $A$ or $B$ is unique, given by eq. (58), and describing always a pure state. (This does not hold for more than two states.) We can describe the measurement process by a series of "classical operations". The first measurement of $C$ operates a mapping $\mathcal C$

$$\mathcal C:\ \rho \to w^C_{+,s}\,\rho_{C+} - w^C_{-,s}\,\rho_{C-} = \rho_C, \tag{83}$$

where $\rho_C$ is a weighted sum of density matrices, but not a density matrix itself. If this is the only measurement, the expectation value of $C$ obtains by taking a trace of $\rho_C$,

$$\langle C\rangle = \mathrm{tr}\,\rho_C. \tag{84}$$

A second measurement of $B$ induces a mapping $\mathcal B$

$$\mathcal B:\ \rho_{C+} \to (w^B_+)^C_+\,\rho_{B+} - (w^B_-)^C_+\,\rho_{B-},$$
$$\phantom{\mathcal B:\ }\rho_{C-} \to (w^B_+)^C_-\,\rho_{B+} - (w^B_-)^C_-\,\rho_{B-}, \tag{85}$$

such that the sequence of two operations reads

$$\mathcal B\mathcal C:\ \rho \to \rho_{BC} = \big[(w^B_+)^C_+\, w^C_{+,s} - (w^B_+)^C_-\, w^C_{-,s}\big]\rho_{B+}
+ \big[(w^B_-)^C_-\, w^C_{-,s} - (w^B_-)^C_+\, w^C_{+,s}\big]\rho_{B-}. \tag{86}$$

Again, if the measurement chain is finished one takes the trace of $\rho_{BC}$ for the evaluation of the expectation value of the product $B\circ C$, reproducing eq. (69). The non-commutativity of the classical operations is manifest. For example, after the second step $\rho_{BC}$ is a linear combination of $\rho_{B+}$ and $\rho_{B-}$, while $\rho_{CB}$ involves a linear combination of $\rho_{C+}$ and $\rho_{C-}$. This chain of operations can be continued. A third measurement of $A$ maps

$$\mathcal A:\ \rho_{B\pm} \to (w^A_+)^B_\pm\,\rho_{A+} - (w^A_-)^B_\pm\,\rho_{A-}, \tag{87}$$

such that taking a trace of $\rho_{ABC}$ after the third measurement reproduces eq. (73) with $w^C_{\pm,\sigma}$ replaced by $w^C_{\pm,s}$, as appropriate for $\langle A\circ B\circ C\rangle$.
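The chain of "classical operations" can be sketched as a linear map on $2\times 2$ matrices. This is an illustration under stated assumptions, not the construction used in the text: each measurement of an observable with operator $\hat M$ is written as $X \to \mathrm{tr}(XP_+)P_+ - \mathrm{tr}(XP_-)P_-$ with $P_\pm = \tfrac12(1\pm\hat M)$, which reproduces eqs. (83), (85) and (87) because $\mathrm{tr}(\rho_{C\pm}P_{B+})$ equals the conditional probability $(w^B_+)^C_\pm$. All names are hypothetical.

```python
I2 = ((1, 0), (0, 1))
TAU = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(x):
    return x[0][0] + x[1][1]


def combine(c1, x, c2, y):
    """Linear combination c1*x + c2*y of 2x2 matrices."""
    return tuple(
        tuple(c1 * x[i][j] + c2 * y[i][j] for j in range(2)) for i in range(2)
    )


def spin(n):
    """Operator n_k tau_k for a unit vector n."""
    return tuple(
        tuple(sum(n[k] * TAU[k][i][j] for k in range(3)) for j in range(2))
        for i in range(2)
    )


def density(rho_vec):
    """rho = (1 + rho_k tau_k) / 2 for a Bloch vector of length <= 1."""
    return combine(0.5, I2, 0.5, spin(rho_vec))


def measure(x, op):
    """One measurement step: x -> tr(x P+) P+ - tr(x P-) P-."""
    p_plus = combine(0.5, I2, 0.5, op)
    p_minus = combine(0.5, I2, -0.5, op)
    w_plus = trace(matmul(x, p_plus))
    w_minus = trace(matmul(x, p_minus))
    return combine(w_plus, p_plus, -w_minus, p_minus)


def chain_value(rho, ops_in_time_order):
    """Trace after applying the measurements in the given time order."""
    x = rho
    for op in ops_in_time_order:
        x = measure(x, op)
    return complex(trace(x)).real


A = spin((0.0, 0.0, 1.0))
B = spin((0.6, 0.0, 0.8))
C = spin((1.0, 0.0, 0.0))
rho = density((0.5, 0.0, 0.3))

rho_BC = measure(measure(rho, C), B)     # first C, then B, eq. (86)
rho_CB = measure(measure(rho, B), C)     # first B, then C
two_point = chain_value(rho, [C, B])     # <B o C>
three_point = chain_value(rho, [C, B, A])  # <A o B o C>
print(two_point, chain_value(rho, [B, C]), three_point)
```

The matrices `rho_BC` and `rho_CB` differ although their traces agree, which mirrors the statement that the operations do not commute while the two point correlation does.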

A general physical measurement process, both for classical statistical systems and for quantum systems, involves three basic ingredients.

(i) Records indicate the state of some measurement device (apparatus) after a measurement. In general, there may be $R_1$ possible values $m_1(r_1)$, $r_1 = 1 \dots R_1$, for the record of the first device. In our case of two-level observables the record of an appropriate apparatus involves only one bit, $r_1 = 1, 2$, $m_1(1) = +1$, $m_1(2) = -1$.

(ii) State reduction describes the influence of the measurement on the state of the system. This is irrelevant if only one measurement is performed, but crucial for the outcome of a sequence of several measurements. Let us consider "minimally destructive measurements". For each given record $m_1$ the state of the system becomes after the measurement an "eigenstate" with "eigenvalue" $m_1$. This means that the measurement of $m_1$ simply eliminates from the ensemble all states which would have a non-zero probability that a repetition of the same measurement would yield a different record $\tilde m_1 \neq m_1$. No further modification is made for the ensemble. The original state "splits" into $R_1$ different alternatives or "histories" which do not influence each other. A second measurement, with an apparatus with $R_2$ possibilities, yields new records $m_2(r_2)$. For a subsequent measurement this second step is performed for the $R_1$ alternative outcomes of the first measurement separately. A convenient way to visualize the situation is a sequence of two Stern-Gerlach type measurements, where the second measurement is performed by two identical devices placed in the two beams into which the incoming beam is split after the first measurement. State reduction after the combination of the two measurements produces $R_1 R_2$ different alternatives. Arbitrary sequences can be constructed in this way.

(iii) Evaluation of the value of the measured observable is some rule how the different records $m_1, m_2, \dots$ are mapped to the value $V$ of the measured observable, which is a real number. In our case of a sequence of measurements of three two-level observables the value of the observable $A\circ B\circ C$ is given by $V = m_1 m_2 m_3$. For $m_j = \pm 1$ the value $V$ may take the values $\pm 1$. More generally, the spectrum of possible values for $V$ may consist of $R_t$ different possibilities, $r_t = 1 \dots R_t$, $V = m_t(r_t)$. Our prescription is general enough to include observables which are measured by a "chain of individual measurements".

For a prediction of the probability $w_t$ for finding the value $m_t$ from the combination of a chain of several individual measurements the state reduction is a crucial ingredient. Indeed, $w_t$ obtains by summing the probabilities of all alternatives or histories for which the records $(m_1, m_2, \dots)$ are mapped into a given $m_t$ by the evaluation of the observable. The probability of a given history, labeled by $(m_1, m_2, \dots)$, multiplies the probability of finding $m_1$ (for a given state of the system) with the probability of finding $m_2$ in the eigenstate with eigenvalue $m_1$ resulting from the state reduction of the first measurement, and so forth. The prescription for the state reduction is, in general, not unique since many different states may be eigenstates with a given eigenvalue $m_1$. The definition of a "combined observable" such as $A\circ B$ needs a specification of the appropriate state reduction.

7. Choice of correlation function 

Several interpretations can be given for the use of condi- 
tional probabilities. One refers to the change of knowledge 
of the observer after the first measurement. The second 
one assumes that the physical state of the system has been 
changed after the first measurement through interaction 
with the apparatus. This does not involve an observer. Fi- 
nally, one may take the point of view that physical theories 
only describe probabilities for different possible histories, 
while reality has only one given history. If part of this 
given history is revealed, for example by the first exper- 
iment, the possibilities contradicting the outcome of the 
first experiment can be eliminated. In this view the essen- 
tial part of a physical theory are the correlations between 
different events. These correlations do, in general, not in- 
volve an observer. From the mathematical point of view 
all those interpretations are described by the same condi- 
tional probability. In a quantum mechanical language this 
corresponds to the famous "reduction of the wave function" 
after the first measurement. 

On the level of micro-states the classical correlation
$\langle A\cdot B\rangle$ is not available since the joint probabilities are
not defined. Both the probabilistic pointwise correlation
$\langle A\times B\rangle$ and the conditional correlation $\langle A\circ B\rangle$ describe
idealized measurements. The probabilistic pointwise cor- 
relation assumes that two measurements are completely 
uncorrelated on the level of the microstates. Suppose that 
the probability distribution for the micro-states singles out 
a particular micro-state for which the observable A has no 
sharp value. Describing two consecutive measurements of 
A by the probabilistic pointwise correlation corresponds to 
a situation where after the first measurement of A the system
relaxes such that it has lost the memory of whether the
first measurement has found the value +1 or -1. In contrast,
the conditional correlation keeps this memory. It
idealizes that one has exactly an eigenstate of the mea- 
sured observable after the first measurement. In a real 
measurement situation there will always be some uncer- 
tainty in the measured value and there are possible physical 
influences between the first and second measurement. This 
would result after the first measurement in a state that is 
not precisely an eigenstate. In principle, one could define 
modified correlation functions in order to account for such 
effects. Obviously, the process of performing a sequence of
n measurements and multiplying the measured numbers,
and then averaging over many such sequences under iden- 
tical conditions, has the necessary product properties for 
the definition of an n-point correlation. 
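The operational definition of an n-point correlation, namely performing sequences of measurements, multiplying the records and averaging, can be mimicked by a small Monte Carlo sketch. It assumes idealized measurements with state reduction to an exact eigenstate and the Bloch-vector parametrization used for illustration ($\langle A\rangle = a_k\rho_k$, eigenstate of record $m$ given by the Bloch vector $m\,a$). The function names, the sample size and the seed are arbitrary choices of this sketch.

```python
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def measure_sequence(directions, rho_vec, rng):
    """Sample the records m_1, m_2, ... of consecutive two-level measurements.

    `directions` lists the unit vectors of the observables in time order.
    After each record the state is reduced to the corresponding eigenstate.
    """
    records = []
    state = rho_vec
    for n in directions:
        p_plus = 0.5 * (1.0 + dot(n, state))
        m = 1 if rng.random() < p_plus else -1
        records.append(m)
        state = tuple(m * x for x in n)   # state reduction
    return records


def sampled_correlation(directions, rho_vec, samples=20000, seed=0):
    """Average of the product of records over many identical sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        value = 1
        for m in measure_sequence(directions, rho_vec, rng):
            value *= m
        total += value
    return total / samples


a = (0.0, 0.0, 1.0)
b = (0.6, 0.0, 0.8)
c = (1.0, 0.0, 0.0)
rho_vec = (0.5, 0.0, 0.3)

est_two = sampled_correlation([b, a], rho_vec)       # first B, then A
est_three = sampled_correlation([c, b, a], rho_vec)  # first C, then B, then A
print(est_two, est_three)
```

Within statistical errors the estimates approach $a_k b_k$ for the two point function and $(a_k b_k)(c_l\rho_l)$ for the three point function in this parametrization.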

Our close association of the correlation functions with 
sequences of measurements underlines that the definition 
of the correlation function is not unique. In principle, a 
classical statistical system admits many different possible 
definitions of product structures and corresponding corre- 
lation functions. The most appropriate choice may actually 
depend on the detailed physical circumstances. Besides the 
probabilistic pointwise product $A\times B$ and the conditional
product $A\circ B$ we recall that the "classical product" $A\cdot B$
can be realized if the probabilistic observables are realized
as classical observables on a substate level. On the level of
substates $r$ the classical product $(A\cdot B)_r = A_r B_r$
involves the sharp values $A_r$, $B_r$ of the observables A and
B in the substate $r$. The classical product can be associated
to the elimination of substates that have values of
B different from the value found in the first measurement.
This state reduction needs, however, a specification of the
precise observable $B_r$ that is measured, and not only of the
associated probabilistic observable. Classical correlations 
therefore correspond to measurements where the properties 
of both system and environment are determined simultane- 
ously. Such measurements do not correspond to measure- 
ments of the system properties, which only should employ 
information contained in the system, but no information 
about the precise state of the environment. 

Furthermore, for the classical product Bell's inequali- 
ties can be directly applied and lead to contradiction with 
observation. This may be interpreted as experimental evi- 
dence that classical correlations should indeed not be used 
for a description of measurements of properties of an iso- 
lated (sub-) system. On the other hand, the conditional 
product yields precisely the prediction of quantum mechan- 
ics for the possible outcomes of two measurements. We will 
therefore postulate that two measurements should always 
be described by the correlation function $\langle A\circ B\rangle$ which we
may call the "quantum correlation" . This should also hold 
for situations where no clear time ordering of the measure- 
ments of the observables A and B is possible. 

Both the classical correlation $\langle A\cdot B\rangle$ and the correlation
$\langle A\circ B\rangle$ are conditional correlations in the sense that
they describe a way how possibilities contradicting the first 
measurement are eliminated. This demonstrates that the 
general notion of a conditional correlation is not unique. 
Any probabilistic theory must therefore not only specify a 
rule how expectation values of observables are calculated, 
but in addition also rules for the "measurement correlation"
$\langle AB\rangle_m$ which specify the outcome of measurements
of pairs of observables. The various possibilities for def- 
initions of conditional probabilities arise from the simple 
observation that it is not sufficient to state that all possi- 
bilities contradicting the first measurement are eliminated 
after the first measurement. This can be done in different 
ways. One also has to specify which information is retained 
and therefore available for the second measurement. The 
classical correlation $\langle A\cdot B\rangle$ can be used only if the precise
observable $B_r$, which measures properties of the environment
in addition to properties of the system, is specified
for the first measurement. A good measurement, which
does not destroy the isolation of the subsystem, should not
require the knowledge of properties of the environment for
a determination of the available information after the first
measurement. It must be possible to specify the state of
the subsystem after the first measurement by using only
information which characterizes the properties of the subsystem.
The correlation $\langle A\circ B\rangle$ has precisely these properties.
We therefore propose that for an optimal "minimally
destructive" measurement in a quantum system the measurement
correlation should be given as $\langle AB\rangle_m = \langle A\circ B\rangle$.



VIII. QUANTUM STATISTICS FROM 
CLASSICAL STATISTICS 

So far we have shown important analogies between the 
quantum mechanics of a two state system and classical 
statistics with infinitely many micro-states. In this section 
we will argue that all aspects of quantum statistics can 
be described by the classical system. Quantum statistics 
appears therefore as a special setting within classical statis- 
tics, where a particular class of probabilistic observables is 
investigated and a particular correlation is used. Conversely,
the formalism of quantum mechanics is a powerful tool for
the computation of properties in classical statistical systems,
such as the conditional correlation functions.

1. Expectation values 

A first basic ingredient of quantum mechanics is a description of the rule for the computation of expectation values of observables. As before, we restrict the discussion to two quantum states. At any given time the information about the state of the system is encoded in the density matrix $\rho$, which is a hermitean $2\times 2$ matrix, $\rho = \rho^\dagger$, with $0 \leq \rho_{11} \leq 1$, $0 \leq \rho_{22} \leq 1$, $\mathrm{tr}\,\rho = 1$, $\mathrm{tr}\,\rho^2 \leq 1$. (Quantum mechanics provides also a law for the time evolution of $\rho$, to which we will turn in the next section.) Quantum mechanics makes probabilistic statements about the outcome of measurements. They are predicted by the expectation values for hermitean operators $\hat A$, according to $\langle A\rangle = \mathrm{tr}(\hat A\rho)$. For the two state system, the only hermitean operators are the (unit) spin operators $\hat S$ up to an overall multiplicative factor. (Pieces proportional to the unit operator may be added trivially.) We have already shown that the law $\langle S\rangle = \mathrm{tr}(\hat S\rho)$ is obeyed by the classical system with infinitely many micro-states. Also the density matrix with the required properties can be computed from the classical probability distribution $\{p_\sigma\}$, using the method of reduction of degrees of freedom. For the continuous family of classical spin observables we therefore have already established that the expectation values obey the quantum mechanical law.

The quantum law for expectation values can be used whenever the expectation value of an observable can be written in the form

$$\langle A\rangle = c_k\rho_k + c_0 = \mathrm{tr}(\hat A\rho), \qquad \hat A = c_k\tau_k + c_0. \tag{88}$$

This holds for all probabilistic observables for which the probabilities to find a given value in the spectrum can be computed from $\rho_k$ by a linear relation. We will call such observables "system observables". "Quantum observables" have the additional property that $f(A)$ is represented by the operator $f(\hat A)$. This holds for the two-level observables associated to spins, but not for the random observable $R$. The quantum observables are a subclass of the system observables.

2. Pure states 

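A second basic concept in quantum statistics is the Hilbert space of states $|\psi\rangle$. They describe pure quantum states by complex two-component normalized vectors, $|\psi\rangle = \psi$, $\langle\psi| = \psi^\dagger$, with $\langle\psi|\psi\rangle = \psi^\dagger\psi = 1$. Pure quantum states have a density matrix obeying $\mathrm{tr}\,\rho^2 = 1$ or $\sum_k \rho_k^2 = 1$. (As we have seen, they correspond to the pure classical states where $\{p_\sigma\}$ has one value one and only zeros otherwise.) The overall phase of $|\psi\rangle$ is unobservable and therefore irrelevant. Only two real numbers are needed in order to describe the physical properties of $|\psi\rangle$, in correspondence to the two independent real numbers $\rho_k$ which remain under the condition $\sum_k \rho_k^2 = 1$. One can therefore construct a mapping between a pure density matrix $\rho$ and the associated state $|\psi\rangle$ up to an arbitrary phase $e^{i\alpha}$. The mapping is straightforward for diagonal $\rho$,

$$\rho = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \leftrightarrow |\psi\rangle = e^{i\alpha}\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
\rho = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \leftrightarrow |\psi\rangle = e^{i\alpha}\begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{89}$$

Any hermitean $\rho$ can be diagonalized by a unitary $SU(2)$ transformation, $UU^\dagger = 1$, $\rho_d$ diagonal,

$$\rho = U\rho_d U^\dagger, \qquad |\psi\rangle = U|\psi_d\rangle. \tag{90}$$

The second equation (90) defines the state $|\psi\rangle$ associated to $\rho$, with $|\psi_d\rangle$ associated to $\rho_d$ according to eq. (89).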

This definition implies that expectation values of arbitrary operators can be computed in pure states as (we use $\psi_d = (1,0)^T$)

$$\langle A\rangle = \langle\psi|\hat A|\psi\rangle = \psi^\dagger\hat A\psi
= \psi_d^\dagger U^\dagger\hat A U\psi_d = (U^\dagger\hat A U)_{11}
= \mathrm{tr}(U\rho_d U^\dagger\hat A) = \mathrm{tr}(\rho\hat A). \tag{91}$$

We recover the standard quantum mechanics law for the computation of expectation values of observables in terms of "probability amplitudes" $\psi$. To every two component complex vector $\psi$ we can associate a pure state density matrix by a two step procedure: (i) normalize $\psi$ by a rescaling with a constant such that $\psi^\dagger\psi = 1$, (ii) construct $\rho_{\alpha\beta} = \psi_\alpha\psi^*_\beta$. In particular, an associated density matrix exists for arbitrary linear combinations $\beta_1\psi_1 + \beta_2\psi_2$. An explicit example for a classical probability distribution which corresponds to an entangled state within four-state quantum mechanics [16] can be found in sect. XI.
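The two step procedure for constructing a pure state density matrix from a two-component vector can be written out directly. The following sketch is only an illustration with hypothetical function names; it checks $\mathrm{tr}\,\rho = 1$, $\mathrm{tr}\,\rho^2 = 1$, the agreement of $\psi^\dagger\hat A\psi$ with $\mathrm{tr}(\rho\hat A)$, and the irrelevance of the overall phase.

```python
import cmath
import math

TAU1 = ((0, 1), (1, 0))
TAU3 = ((1, 0), (0, -1))


def matmul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(x):
    return x[0][0] + x[1][1]


def normalize(psi):
    """Step (i): rescale psi such that psi^dagger psi = 1."""
    norm = math.sqrt(sum(abs(z) ** 2 for z in psi))
    return tuple(complex(z) / norm for z in psi)


def density_from_state(psi):
    """Step (ii): rho_{alpha beta} = psi_alpha psi_beta^*."""
    psi = normalize(psi)
    return tuple(
        tuple(psi[i] * psi[j].conjugate() for j in range(2)) for i in range(2)
    )


def expectation(psi, op):
    """<psi|op|psi> for the normalized vector."""
    psi = normalize(psi)
    return sum(
        psi[i].conjugate() * op[i][j] * psi[j]
        for i in range(2)
        for j in range(2)
    )


psi = (1 + 1j, 2)
rho = density_from_state(psi)
rho_phase = density_from_state(tuple(cmath.exp(0.7j) * z for z in psi))

print(trace(rho).real, trace(matmul(rho, rho)).real)
print(expectation(psi, TAU3).real, trace(matmul(rho, TAU3)).real)
```

The density matrix built from the phase-rotated vector coincides with `rho`, in line with the statement that the overall phase is unobservable.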

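In quantum mechanics the transition amplitude $M_{ab}$ between two pure states $|\psi_a\rangle$ and $|\psi_b\rangle$ is defined as $M_{ab} = \langle\psi_a|\psi_b\rangle$, and the transition probability obeys $w_{ab} = |M_{ab}|^2$. We will next establish that the transition probability is precisely the conditional probability discussed in the preceding section,

$$(w^A_\pm)^B_+ = |\langle\pm_A|+_B\rangle|^2, \qquad (w^A_\pm)^B_- = |\langle\pm_A|-_B\rangle|^2. \tag{92}$$

Here the quantum states $|\pm_A\rangle$ are the eigenstates of the operator $\hat A$ with eigenvalues $\pm 1$,

$$\hat A|+_A\rangle = |+_A\rangle, \qquad \hat A|-_A\rangle = -|-_A\rangle. \tag{93}$$

In order to show eq. (92) we use the completeness of the Hilbert space which allows the insertion of a complete set of states,

$$1 = \langle +_B|+_B\rangle = \langle +_B|+_A\rangle\langle +_A|+_B\rangle + \langle +_B|-_A\rangle\langle -_A|+_B\rangle, \tag{94}$$

and

$$\langle +_B|\hat A|+_B\rangle = \langle +_B|\hat A|+_A\rangle\langle +_A|+_B\rangle + \langle +_B|\hat A|-_A\rangle\langle -_A|+_B\rangle
= |\langle +_A|+_B\rangle|^2 - |\langle -_A|+_B\rangle|^2. \tag{95}$$

From the definition of the conditional probability and eqs. (94), (95) one obtains

$$(w^A_+)^B_+ = \tfrac{1}{2}\left(1 + \langle A\rangle_{+B}\right) = \tfrac{1}{2}\left(1 + \langle +_B|\hat A|+_B\rangle\right) = |\langle +_A|+_B\rangle|^2, \tag{96}$$

and similar for the other combinations in eq. (92).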
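The identification of conditional probabilities with transition probabilities can be verified for explicit eigenvectors. The sketch below assumes the standard spin-1/2 parametrization of the eigenvectors of $n_k\tau_k$ by polar angles, which is not taken from the text, and compares $|\langle\pm_A|\pm_B\rangle|^2$ with $\tfrac12(1 \pm a_k b_k)$. Function names are hypothetical.

```python
import cmath
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def eigenstate(n, sign):
    """Eigenvector of n_k tau_k with eigenvalue `sign` (+1 or -1), n a unit vector."""
    theta = math.acos(max(-1.0, min(1.0, n[2])))
    phi = math.atan2(n[1], n[0])
    if sign > 0:
        return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))
    return (-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2))


def transition_prob(x, y):
    """|<x|y>|^2 for two normalized two-component vectors."""
    amplitude = sum(complex(xi).conjugate() * yi for xi, yi in zip(x, y))
    return abs(amplitude) ** 2


def conditional_prob(a, b, sign_a, sign_b):
    """Eq. (54) with <A>_{+-B} = +- a.b: probability of sign_a for A after sign_b for B."""
    return 0.5 * (1.0 + sign_a * sign_b * dot(a, b))


a = (0.6, 0.0, 0.8)
b = (0.0, 0.6, 0.8)
for sign_a in (+1, -1):
    for sign_b in (+1, -1):
        print(
            sign_a,
            sign_b,
            transition_prob(eigenstate(a, sign_a), eigenstate(b, sign_b)),
            conditional_prob(a, b, sign_a, sign_b),
        )
```

For each of the four sign combinations the two printed numbers agree up to rounding, as stated in eq. (92).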

3. Operator product 

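A third crucial ingredient for quantum statistics is the definition of an operator product $\hat A\hat B$ and the determination of quantum correlations, as

$$\mathrm{Re}\langle\hat A\hat B\rangle = \mathrm{Re}\,\mathrm{tr}(\hat A\hat B\rho) = \tfrac{1}{2}\,\mathrm{tr}\left(\{\hat A,\hat B\}\rho\right),$$
$$\mathrm{Re}\langle\hat A\hat B\hat C\rangle = \mathrm{Re}\,\mathrm{tr}(\hat A\hat B\hat C\rho) = \tfrac{1}{2}\,\mathrm{tr}\left((\hat A\hat B\hat C + \hat C\hat B\hat A)\rho\right)$$
$$\quad = \tfrac{1}{4}\,\mathrm{tr}\left(\big(\{\{\hat A,\hat B\},\hat C\} + [[\hat A,\hat B],\hat C]\big)\rho\right)$$
$$\quad = \tfrac{1}{4}\,\mathrm{tr}\left(\big(\{\{\hat A,\hat B\},\hat C\} + \{\hat A,\{\hat B,\hat C\}\} - \{\hat B,\{\hat A,\hat C\}\}\big)\rho\right). \tag{97}$$

Since we have defined for the classical system the spin operators as $2\times 2$ matrices, we can, of course, define the matrix product and compute the quantum correlations (97) for $A$, $B$, $C$ corresponding to arbitrary spin observables. Beyond this formal definition of the quantum correlations (97) we want to establish their close connection to the conditional correlations discussed in the preceding section.

For this purpose we compute the expectation value of the anti-commutator of two operators in an arbitrary pure state $|s\rangle$. With

$$\langle s|\hat A\hat B|s\rangle = \langle s|\hat A|+_B\rangle\langle +_B|\hat B|s\rangle + \langle s|\hat A|-_B\rangle\langle -_B|\hat B|s\rangle$$
$$\quad = |\langle +_B|s\rangle|^2\langle +_B|\hat A|+_B\rangle - |\langle -_B|s\rangle|^2\langle -_B|\hat A|-_B\rangle
+ \big(\langle +_B|s\rangle\langle s|-_B\rangle\langle -_B|\hat A|+_B\rangle - \mathrm{c.c.}\big), \tag{98}$$

one finds the conditional correlation (69),

$$\tfrac{1}{2}\langle s|\{\hat A,\hat B\}|s\rangle = \mathrm{Re}\langle s|\hat A\hat B|s\rangle
= \langle A\rangle_{+B}\, w^B_{+,s} - \langle A\rangle_{-B}\, w^B_{-,s} = \langle A\circ B\rangle_s, \tag{99}$$

where $w^B_{\pm,s} = |\langle\pm_B|s\rangle|^2$. This establishes the relation (72) for any pure state density matrix. The extension to arbitrary $\rho$ uses the fact that $\rho$ can always be written as $\rho = w_1\rho_1 + w_2\rho_2$ with $\rho_{1,2}$ pure state density matrices and real $w_{1,2} \geq 0$, $w_1 + w_2 = 1$. With $|s_1\rangle$, $|s_2\rangle$ the pure states corresponding to $\rho_1$, $\rho_2$, one has

$$\tfrac{1}{2}\,\mathrm{tr}\left(\{\hat A,\hat B\}\rho\right) = \tfrac{w_1}{2}\langle s_1|\{\hat A,\hat B\}|s_1\rangle + \tfrac{w_2}{2}\langle s_2|\{\hat A,\hat B\}|s_2\rangle$$
$$\quad = \langle A\rangle_{+B}\left(w_1|\langle +_B|s_1\rangle|^2 + w_2|\langle +_B|s_2\rangle|^2\right)
- \langle A\rangle_{-B}\left(w_1|\langle -_B|s_1\rangle|^2 + w_2|\langle -_B|s_2\rangle|^2\right). \tag{100}$$

Using

$$\tfrac{1}{2}\left(1 \pm \mathrm{tr}(\rho\hat B)\right) = w_1|\langle\pm_B|s_1\rangle|^2 + w_2|\langle\pm_B|s_2\rangle|^2 \tag{101}$$

this shows that the r.h.s. of eq. (100) coincides with the last expression in eq. (69). An analogous, but somewhat more lengthy computation establishes eq. (77) for the conditional three point correlation, and can also be used for higher correlation functions.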
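The operator identities entering eq. (97) are purely algebraic and can be checked numerically for sample spin operators. The sketch below is an illustration with hypothetical helper names; it represents the operators as $n_k\tau_k$ and the state by a Bloch vector, both of which are assumptions of this example.

```python
I2 = ((1, 0), (0, 1))
TAU = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def trace(x):
    return x[0][0] + x[1][1]


def combine(c1, x, c2, y):
    """Linear combination c1*x + c2*y of 2x2 matrices."""
    return tuple(
        tuple(c1 * x[i][j] + c2 * y[i][j] for j in range(2)) for i in range(2)
    )


def spin(n):
    return tuple(
        tuple(sum(n[k] * TAU[k][i][j] for k in range(3)) for j in range(2))
        for i in range(2)
    )


def anti(x, y):
    """Anticommutator {x, y}."""
    return combine(1, matmul(x, y), 1, matmul(y, x))


def comm(x, y):
    """Commutator [x, y]."""
    return combine(1, matmul(x, y), -1, matmul(y, x))


def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


A = spin((0.0, 0.0, 1.0))
B = spin((0.6, 0.0, 0.8))
C = spin((0.0, 1.0, 0.0))
rho = combine(0.5, I2, 0.5, spin((0.5, 0.2, 0.3)))

# (ABC + CBA)/2 versus the two nested forms in eq. (97).
sym = combine(0.5, matmul(matmul(A, B), C), 0.5, matmul(matmul(C, B), A))
form1 = combine(0.25, anti(anti(A, B), C), 0.25, comm(comm(A, B), C))
form2 = combine(
    1,
    combine(0.25, anti(anti(A, B), C), 0.25, anti(A, anti(B, C))),
    -0.25,
    anti(B, anti(A, C)),
)

re_ab = complex(trace(matmul(matmul(A, B), rho))).real          # Re tr(AB rho)
half_anti = complex(0.5 * trace(matmul(anti(A, B), rho))).real  # tr({A,B} rho)/2
print(close(sym, form1), close(sym, form2), re_ab, half_anti)
```

Both nested forms reproduce the symmetrized product, and the real part of $\mathrm{tr}(\hat A\hat B\rho)$ agrees with $\tfrac12\mathrm{tr}(\{\hat A,\hat B\}\rho)$ for the chosen example.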

4. Quantum measurements 

A fourth cornerstone of quantum mechanics is a rule how to express the possible outcome of measurements in terms of expectation values of observables. Such a rule is needed for every theory. For our classical statistical system we employ a rule based on conditional probabilities for consecutive measurements. It is the same as in quantum mechanics. We have shown how the conditional correlation functions in classical statistics can be expressed in terms of quantum correlation functions. Conversely, our computation provides a physical interpretation of quantum correlations in terms of the outcome of a sequence of measurements. Only the real part of the expectation values of products of operators can be measurable quantities. From eq. (97) we see how they can be related directly to conditional correlations. We note that the three point function $\mathrm{Re}\langle\hat A\hat B\hat C\rangle$ does not simply correspond to one order of measurements (say first $C$, then $B$, last $A$), but rather to a linear combination of sequences in different orders, as given by the last equation in (97). This is closely related to the term involving commutators in eq. (97). The one to one correspondence between quantum correlations and conditional correlations closes the proof of equivalence between quantum statistics and classical statistics with infinitely many micro-states. All measurable quantities can be computed in either approach.

The quantum pure states and the classical pure states are in one to one correspondence. Both can be parameterized by the coordinates on the sphere $S^2$, as given by the condition $\mathrm{tr}\,\rho^2 = 1$ or $\sum_k \rho_k^2 = 1$. From the point of view of the classical probability distribution $\{p_\sigma\}$ the classical pure states are sharp states with one $p_\sigma$ equal to one - namely the one in the direction specified by the location of the state on $S^2$ - and all other probabilities vanishing, $p_{\sigma' \neq \sigma} = 0$. There is no statistical distribution on the level of $\{p_\sigma\}$. For the pure states the statistical character of the system arises therefore only from the notion of probabilistic observables. Only one spin has a sharp value in a given pure state, namely $\langle S\rangle = 1$. It is the one which points in the direction corresponding to the location of the state on the sphere. All other spins have $\langle S\rangle^2 < 1$ and therefore correspond to measurements with a statistical distribution of values $+1$ and $-1$. (For this counting the directions of the spin variables cover only half of the sphere, $S^2/Z_2$, and we have omitted the trivial extension $\langle S\rangle = -1$ for the spin opposite to the direction of the state.) In this setting the statistical character of quantum mechanics is genuinely linked to the notion of probabilistic observables. We also note that the notions of classical eigenstates and classical eigenvalues are in direct correspondence to the quantum mechanical definition of eigenstates and eigenvalues.

5. Operator algebra 

Finally, quantum mechanics has the useful structure of
linear combinations of operators, $\lambda_1 A + \lambda_2 B$. It is compatible
with the rule for expectation values

$$\langle \lambda_1 A + \lambda_2 B\rangle = \lambda_1 \langle A\rangle + \lambda_2 \langle B\rangle. \qquad (102)$$

We can take this construction over to the two-level observables
A, B in the classical system with infinitely many
degrees of freedom. For real $\lambda_{1,2}$ obeying $\lambda_1^2 + \lambda_2^2 = 1$
the combination $\lambda_1 A + \lambda_2 B$ is again a two-level observable
- the linear combination remains a map in the space
of two-level observables. We can define rotated spins in
this way. (Obviously, this is no longer possible for a finite
number of micro-states, where the linear combination
with arbitrary $\lambda_i$, $\lambda_1^2 + \lambda_2^2 = 1$, is no longer defined. For
finite N the allowed $\lambda_i$ have to be restricted such that allowed
spins are reached by the rotation - in our example
with two spins the rotations have to be restricted to discrete
$Z_N$-transformations.) We may relax the condition
$\lambda_1^2 + \lambda_2^2 = 1$ by defining formally the multiplication of an
observable by a complex number $\lambda$ using the replacement
$\bar A_\sigma \to \lambda \bar A_\sigma$, such that $\langle \lambda A\rangle = \lambda \langle A\rangle$. For real $\lambda > 0$ this
amounts to a change of units for the observables, replacing
in eq. dSI) $\delta(x \pm 1) \to \delta(x \pm \lambda)$. Multiplication with -1
corresponds to a map of the spin to the spin with opposite
direction on the sphere. For real $\lambda$ all observables remain
two-level observables with $\langle A^2\rangle = \lambda^2$. The multiplication
with i, or generally complex $\lambda$, remains formal and is not
related to the outcome of possible measurements. It is,
nevertheless, a useful computational tool since it gives to
the space of observables the structure of a complex vector
space. This is analogous to the multiplication of quantum
states by arbitrary complex numbers. It is needed in
order to implement the vector-space structure of Hilbert
space, even though physical states should be normalized
to unit norm.

By a combination of rotations in the space of two-level
observables with $\langle A^2\rangle = 1$ and scalings we have defined
arbitrary linear combinations $A = e_k A^{(k)}$ of the classical
"basis observables" $A^{(k)}$. These quantum observables
are represented by a complex three-component vector
$e = (e_1 \dots e_3)$. The expectation values are defined in the
classical ensemble and obey $\langle A\rangle = \rho_k e_k$, $\langle A^2\rangle = e_k e_k$.
This is in one to one correspondence with the operators
$\hat A = e_k \tau_k$, $\hat A^2 = e_k e_k$. The hermitean conjugation of
a classical observable, $A^\dagger$, is defined as $e_k \to e_k^*$. Measurements
must yield real values such that measurable observables
are $(A + A^\dagger)/2$. The most general operator in
the Hilbert space of two-state quantum mechanics reads
$\hat A = e_k \tau_k + e_0$. Using the unit observable every operator
has its corresponding classical two-level observable
$A = e_k A^{(k)} + e_0$. The possible outcomes of individual
measurements of A in the classical ensemble are given by
the eigenvalues of the $2 \times 2$-matrix $\hat A$.
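The correspondence between the vector $(e_k, e_0)$ and the operator $e_k \tau_k + e_0$ can be illustrated numerically. The following is a minimal sketch, assuming that $\tau_k$ are the Pauli matrices and that the state is given by $\rho = (1 + \rho_k \tau_k)/2$; the helper names (`operator`, `density_matrix`, `eigenvalues`) and the sample numbers are hypothetical and chosen only for illustration.

```python
import cmath

# Pauli matrices tau_1, tau_2, tau_3 as nested lists (assumed convention).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def operator(e, e0=0.0):
    """Return the 2x2 matrix e_k tau_k + e0 * 1."""
    return [
        [sum(e[k] * TAU[k][i][j] for k in range(3)) + (e0 if i == j else 0.0)
         for j in range(2)]
        for i in range(2)
    ]


def density_matrix(rho):
    """Return (1 + rho_k tau_k) / 2 for a real vector rho."""
    m = operator(rho, 1.0)
    return [[m[i][j] / 2 for j in range(2)] for i in range(2)]


def trace_of_product(a, b):
    """Return tr(a b) for 2x2 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(2) for j in range(2))


def eigenvalues(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr - root) / 2, (tr + root) / 2)


# Illustrative numbers only: a real unit vector e, a shift e0, a mixed state rho.
e, e0 = (0.6, 0.0, 0.8), 0.5
rho = (0.3, 0.0, 0.4)

A_hat = operator(e, e0)
spectrum = eigenvalues(A_hat)                               # e0 - |e|, e0 + |e|
expectation = trace_of_product(density_matrix(rho), A_hat)  # rho_k e_k + e0
```

For a real vector e the two possible measurement values are $e_0 \pm |e|$, and the trace reproduces $\langle A\rangle = \rho_k e_k + e_0$.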

The quantum mechanical operator product $\hat A \hat B$ can be
mapped onto a "quantum product" of classical probabilistic
quantum observables AB, as defined by the associated
vector e and $e_0$,

$$e_0^{(AB)} = e_k^{(A)} e_k^{(B)}, \quad e_k^{(AB)} = i\,\epsilon_{klm}\, e_l^{(A)} e_m^{(B)}. \qquad (103)$$

With this product we can define an algebra of probabilistic
quantum observables that is isomorphic to the algebra
of quantum operators. On the level of the probabilistic
observables one may at first sight wonder why one should
introduce the particular product (103). However, we have
seen already how to employ the quantum product for the
computation of the outcome of a sequence of measurements
in terms of conditional correlations. The expectation value
of the quantum product AB is closely related to the expectation
value of the conditional product $A \circ B$ by eq. (72).
Another important use of the quantum product AB is the
discussion of the minimal value of the product of the dispersions
for two observables. It can be expressed in terms
of the commutator AB - BA by the Heisenberg uncertainty
relation.
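The product rule (103) can be checked against ordinary matrix multiplication. The sketch below assumes that both observables have vanishing unit part ($e_0 = 0$) and that $\tau_k$ are the Pauli matrices; `quantum_product` and `max_deviation` are hypothetical helper names introduced for this illustration.

```python
# Pauli matrices tau_1, tau_2, tau_3 (assumed convention).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def levi_civita(k, l, m):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (k - l) * (l - m) * (m - k) // 2


def operator(e, e0=0):
    """Return the 2x2 matrix e_k tau_k + e0 * 1."""
    return [
        [sum(e[k] * TAU[k][i][j] for k in range(3)) + (e0 if i == j else 0)
         for j in range(2)]
        for i in range(2)
    ]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def quantum_product(e_a, e_b):
    """Return (e0, e) of the product of e_a.tau and e_b.tau, following eq. (103)."""
    e0 = sum(e_a[k] * e_b[k] for k in range(3))
    e = [
        1j * sum(levi_civita(k, l, m) * e_a[l] * e_b[m]
                 for l in range(3) for m in range(3))
        for k in range(3)
    ]
    return e0, e


def max_deviation(e_a, e_b):
    """Largest entry-wise difference between the matrix product and eq. (103)."""
    lhs = matmul(operator(e_a), operator(e_b))
    e0, e = quantum_product(e_a, e_b)
    rhs = operator(e, e0)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

For example, `quantum_product((1, 0, 0), (0, 1, 0))` gives a vanishing unit part and the vector $(0, 0, i)$, which is the familiar relation $\tau_1 \tau_2 = i \tau_3$. Exchanging the two arguments flips the sign of the vector part, so the commutator AB - BA is carried entirely by the $\epsilon_{klm}$ term.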

6. Beyond quantum observables 

Not all possible observables in a quantum system find a
standard description in terms of quantum operators. Here
an "observable" refers to a property of the system whose
value (a real number) can be measured by some suitable
apparatus. For an "observable of the quantum system" the
spectrum of its possible measurement values should be determined
by the properties of the quantum system, and the
probabilities for finding a value within the spectrum should
be computable in terms of the information characterizing
the state of the quantum system (the density matrix $\rho$).

As an example for a measurement apparatus we consider
a sequence of two Stern-Gerlach magnets, with directions
of the magnetic fields rotated by 90 degrees relative
to each other. For an arbitrarily polarized incoming
beam the outcome will be four beams. For definiteness,
they correspond to the spin values $(S_z, S_x) =
(+,+), (+,-), (-,+), (-,-)$, where $S_z$ refers to the spin after
the first apparatus and $S_x$ to the spin after the second
apparatus. We choose a two-level observable R which takes
the value +1 if $S_z$ and $S_x$ have the same sign, and
-1 for opposite signs. The probability $w_+$ for finding
the value +1 can be measured by dividing the number of
atoms in the two beams $(+,+)$ and $(-,-)$ by the number
of atoms in the incoming beam, and similarly for $w_-$. Thus
the spectrum $\pm 1$ is known and the probabilities $w_\pm$
can be measured and computed for any incoming state -
the observable R is an observable of the quantum system.

For our setting a quantum mechanical computation
yields $w_+ = w_- = 1/2$ for an arbitrary polarization of
the incoming beam. In other words, the observable R has
a vanishing expectation value for all states of the quan- 
tum system. The only quantum operator consistent with 
this property is A = 0. However, this operator has a spec- 
trum with only one possible eigenvalue, namely zero, and 
not (+1, —1) as appropriate for our two-level observable. 
If we request that the spectrum of possible measurement 
values of an observable should correspond to the spectrum 
of eigenvalues of an associated operator, we must conclude 
that not all observables of a quantum system can be de- 
scribed by quantum operators. 
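The statement that R has a vanishing expectation value for every state can be illustrated with the standard projection rule for consecutive measurements. The sketch below is an illustration under stated assumptions: the incoming beam is described by $\rho = (1 + \rho_k \tau_k)/2$, the first magnet measures $\tau_3$, the second measures $\tau_1$, and the joint probability of a sequence is $\mathrm{tr}(P_x P_z \rho P_z)$ with projectors $P = (1 \pm \tau)/2$. All function names are hypothetical.

```python
# Pauli matrices tau_1, tau_2, tau_3 (assumed convention).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def density_matrix(rho):
    """Return (1 + rho_k tau_k) / 2 for a real vector rho with |rho| <= 1."""
    return [
        [(IDENTITY[i][j] + sum(rho[k] * TAU[k][i][j] for k in range(3))) / 2
         for j in range(2)]
        for i in range(2)
    ]


def projector(axis, sign):
    """Projector (1 + sign * tau_axis) / 2 onto the spin value sign = +1 or -1."""
    return [[(IDENTITY[i][j] + sign * TAU[axis][i][j]) / 2 for j in range(2)]
            for i in range(2)]


def sequence_probability(rho, s_z, s_x):
    """Probability to find S_z = s_z in the first and S_x = s_x in the second magnet."""
    p_z = projector(2, s_z)
    p_x = projector(0, s_x)
    m = matmul(p_x, matmul(p_z, matmul(density_matrix(rho), p_z)))
    return (m[0][0] + m[1][1]).real


def w_plus(rho):
    """Probability that S_z and S_x have the same sign (R = +1)."""
    return sequence_probability(rho, 1, 1) + sequence_probability(rho, -1, -1)


def w_minus(rho):
    """Probability that S_z and S_x have opposite signs (R = -1)."""
    return sequence_probability(rho, 1, -1) + sequence_probability(rho, -1, 1)
```

Whatever Bloch vector is passed in, `w_plus` and `w_minus` both return 1/2 up to rounding, because the state behind the first magnet is an eigenstate of $S_z$ and therefore gives equal weight to both values of $S_x$.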

Within our setting of probabilistic observables the ob- 
servable R finds a simple place. It is given by 

R = o (104) 

and equals the random observable discussed in the preceding
section. In a certain sense the ensemble of probabilistic
observables is more complete than the ensemble
of quantum operators, since arbitrary observables of the
quantum system can be described. In contrast, the standard
association of operators and observables in quantum
mechanics covers only the quantum observables, which are
a subclass of the more general system observables. The random
observable R is a system observable but not a quantum
observable. Of course, the concept of a larger class of
probabilistic observables can be implemented in quantum 
mechanics just as well as in classical statistics. 



IX. SIMPLE EXAMPLE: CARTESIAN SPINS 

The basic ingredients for the reduction of a classical ensemble
to a subsystem with quantum behavior can be understood
in a simple example with a finite number of classical
substates $\tau$. Of course, as we have seen in sect. IV,
rotation symmetry can no longer be realized in such a system.
We will discuss here three cartesian spins $S_x$, $S_y$, $S_z$,
while continuous rotations of these observables are not defined.



On the substate level we consider eight substates labeled
by $\tau = 1, \dots, 8$ or $\tau = (+,+,+), (+,+,-), (+,-,+),
(+,-,-), (-,+,+), (-,+,-), (-,-,+), (-,-,-)$. The
spin observables have fixed values in each substate.
In a direct product basis we can represent $S_x = (1 \otimes 1 \otimes
\tau_3)$, $S_y = (1 \otimes \tau_3 \otimes 1)$, $S_z = (\tau_3 \otimes 1 \otimes 1)$. The classical
ensemble is specified by the probabilities $p_\tau \geq 0$, $\{p_\tau\} =
(p_1, \dots, p_8)$, $\sum_\tau p_\tau = 1$. The classical observables of this
system can be constructed from $S_k$ and the classical products
$S_k \cdot S_l$, $S_k \cdot S_l \cdot S_j$. Beyond the unit observable one has a
total number of seven independent observables, comprising
the three spins $S_k$ and four observables $E_1 = S_2 \cdot S_3$, $E_2 =
S_1 \cdot S_3$, $E_3 = S_1 \cdot S_2$, $E_4 = S_1 \cdot S_2 \cdot S_3$. By a measurement of
all seven expectation values $\langle S_k\rangle$, $\langle E_i\rangle$ one can determine
all probabilities $p_1, \dots, p_8$. The joint probabilities for arbitrary
pairs of observables $A_y$ and $A_z$ are available (with
$A_z = (S_k, E_i)$ and $A_z^2 = 1$) and we deal with a complete
statistical system.

Let us now define a subsystem with two properties: (i)
Only the three cartesian spins $S_k$ are system observables,
while the observables $E_1, \dots, E_4$ are considered as "environment
observables". Measurements in the subsystem cannot
determine the expectation values $\langle E_i\rangle$ for the environment
observables. In turn, the information contained in
$\langle E_i\rangle$ cannot be used for predictions of properties of the
subsystem which do not involve the environment. Since
the information about the joint probabilities for the spins
$S_k$ is directly related to the expectation values $\langle E_i\rangle$ they are
not available in the subsystem. The subsystem is described
by incomplete statistics. (ii) The purity of the system is
bounded by one, $P = \sum_k \rho_k^2 \leq 1$, $\rho_k = \langle S_k\rangle$. This restricts
the most general classical probability distribution,
as given by seven independent real numbers $p_1, \dots, p_7$, to
a subspace obeying the inequality $P \leq 1$, with

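$$\begin{aligned}
P = {} & 3 - 4(3p_1 + 2p_2 + 2p_3 + p_4 + 2p_5 + p_6 + p_7) \\
& + 4(3p_1^2 + 2p_2^2 + 2p_3^2 + p_4^2 + 2p_5^2 + p_6^2 + p_7^2) \\
& + 8(2p_1p_2 + 2p_1p_3 + p_1p_4 + 2p_1p_5 + p_1p_6 + p_1p_7 \\
& \quad + p_2p_3 + p_2p_4 + p_2p_5 + p_2p_6 + p_3p_4 + p_3p_5 \\
& \quad + p_3p_7 + p_5p_6 + p_5p_7). \qquad (106)
\end{aligned}$$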

(For example, $p_1 = 1$, $p_{\tau \neq 1} = 0$ is not allowed since this
would lead to $P = 3$.) The purity constraint $P \leq 1$ allows
us to construct the density matrix $\rho = (1 + \langle S_k\rangle \tau_k)/2$
such that $\langle S_k\rangle = \mathrm{tr}(\hat S_k \rho)$, with $\hat S_k = \tau_k$. The subsystem
shows many properties of a quantum mechanical system
for three cartesian spins, provided we use the appropriate
conditional correlation for the prediction of the outcome of
sequences of measurements.

After a first measurement has found $S_z = 1$ we should
eliminate all states for which $\langle S_z\rangle \neq 1$. This implies for
the state after this measurement $p_5 = p_6 = p_7 = p_8 = 0$,
$p_1 + p_2 + p_3 + p_4 = 1$. Obviously, this requirement
is not enough to fix the state uniquely, since we still have
three free real numbers $p_1, p_2, p_3$. Different types of measurements
correspond to different $(p_1, p_2, p_3)$. A "classical
measurement" would be able to employ information
concerning both the subsystem and the environment. A
"perfect classical measurement" could retain the relative
weights of the probabilities $p_1, p_2, p_3, p_4$ as they have been
before the measurement. The state after the measurement
would then be characterized by the "classical rule",
according to which the three numbers $p_j$ are given by
$p_j = p'_j/(p'_1 + p'_2 + p'_3 + p'_4)$, $j = 1, 2, 3$. Here $p'_j$ denote the
probabilities in the state before measurement. The corresponding
measurement correlation equals the classical (or
pointwise) correlation.
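The purity formula (106) can be cross-checked against the direct definition $P = \sum_k \langle S_k\rangle^2$. The sketch below assumes the substate ordering listed in the text, with the three signs of each substate read as $(S_z, S_y, S_x)$ in accordance with the direct product representation given above, and it eliminates $p_8$ through the normalization. The function names are hypothetical.

```python
from itertools import product

# Substates tau = 1..8 as sign triples (S_z, S_y, S_x), in the order
# (+,+,+), (+,+,-), (+,-,+), (+,-,-), (-,+,+), (-,+,-), (-,-,+), (-,-,-).
SUBSTATES = list(product((1, -1), repeat=3))


def spin_expectations(p):
    """Return (<S_z>, <S_y>, <S_x>) for the eight substate probabilities p."""
    return tuple(
        sum(p_tau * signs[k] for p_tau, signs in zip(p, SUBSTATES))
        for k in range(3)
    )


def purity(p):
    """Purity P as the sum of the squared spin expectation values."""
    return sum(x * x for x in spin_expectations(p))


def purity_eq106(p):
    """Purity written in terms of p_1..p_7 as in eq. (106); p_8 does not appear."""
    p1, p2, p3, p4, p5, p6, p7 = p[:7]
    linear = 3 * p1 + 2 * p2 + 2 * p3 + p4 + 2 * p5 + p6 + p7
    squares = (3 * p1**2 + 2 * p2**2 + 2 * p3**2 + p4**2
               + 2 * p5**2 + p6**2 + p7**2)
    cross = (2 * p1 * p2 + 2 * p1 * p3 + p1 * p4 + 2 * p1 * p5 + p1 * p6 + p1 * p7
             + p2 * p3 + p2 * p4 + p2 * p5 + p2 * p6 + p3 * p4 + p3 * p5
             + p3 * p7 + p5 * p6 + p5 * p7)
    return 3 - 4 * linear + 4 * squares + 8 * cross
```

For the sharp substate `(1, 0, 0, 0, 0, 0, 0, 0)` both functions return 3, which violates the constraint $P \leq 1$, while the uniform distribution gives P = 0.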

This procedure is not possible, however, for a measurement
that is compatible with the isolation of the subsystem.
The information needed for the computation of
$(p_1, p_2, p_3)$ after the measurement according to the classical
rule, namely $p'_j$, $j = 1, \dots, 4$, is more than what is contained
in the three numbers $\rho_k = \langle S_k\rangle$ which characterize
the state of the subsystem. Furthermore, the probabilities
for the new state, computed according to the classical rule,
would not necessarily obey the purity constraint $P \leq 1$ any
longer. A simple counterexample has for the original state
before the measurement $p'_1 = p'_5 = p'_8 = 1/3$ with purity
$P' = 1/3$, where the classical rule would imply after the
measurement $p_1 = 1$, $P = 3$.

A good measurement of pure substate properties should
not involve environment information for the determination
of the state of the subsystem after the measurement. In
addition, the purity constraint $P \leq 1$ should be obeyed
for the state after the first measurement. Therefore the
state after the first measurement should again be characterized
by three numbers $\rho_k$, with $P = \sum_k \rho_k^2 \leq 1$. As
we have seen already in sect. VII these requirements fix
uniquely the state after the first measurement. It must
obey $\rho_1 = \rho_2 = 0$, $\rho_3 = 1$, since any nonzero $\rho_1$ or $\rho_2$,
together with $P \leq 1$, would imply $\rho_3 < 1$ and therefore
$\langle S_z\rangle < 1$, in contradiction with the measurement $S_z = 1$.
This "quantum rule" for the determination of $(p_1, p_2, p_3)$
after the first measurement implies $p_2 = p_3 = 1/2 - p_1$.
The classical probabilities are not fixed completely since
$p_1$ remains free within the interval $0 \leq p_1 \leq 1/2$. However,
the undetermined part only concerns properties of the environment.
We note the relations

$$p_1 + p_2 = p_1 + p_3 = p_2 + p_4 = p_3 + p_4 = \frac{1}{2}, \qquad (107)$$

which imply that a subsequent measurement of $S_x$ or $S_y$
has equal probability to find values +1 or -1. We may
have started before the measurement with a state where
$p'_2 + p'_4 = 0$ (as, for example, with the state $p'_1 = p'_5 =
p'_8 = 1/3$). Nevertheless, after the measurement one finds
$p_2 + p_4 = 1/2$. The perfect "subsystem measurement",
which does not involve environment properties, leads to a
change in the relative probabilities for the classical ensemble
of subsystem plus environment. (In the example with
$p'_1 = p'_5 = p'_8 = 1/3$ the ratio $(p'_2 + p'_4)/p'_1 = 0$ changes to
$(p_2 + p_4)/p_1 = 1/(2p_1)$.) In other words, a good subsystem
measurement necessarily leaves traces in the environment.
Typically, these traces are not recorded, however.
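The difference between the classical rule and the quantum rule can be made concrete with the example state $p'_1 = p'_5 = p'_8 = 1/3$. The following sketch assumes the same substate ordering as in the text, with signs read as $(S_z, S_y, S_x)$. The functions `classical_rule` and `quantum_rule` are hypothetical names for the two prescriptions after a first measurement $S_z = 1$, and the value $p_1 = 0.2$ is an arbitrary choice inside the allowed interval.

```python
from itertools import product

# Substates as sign triples (S_z, S_y, S_x), ordered as in the text.
SUBSTATES = list(product((1, -1), repeat=3))


def spin_expectations(p):
    """Return (<S_z>, <S_y>, <S_x>) for the eight substate probabilities p."""
    return tuple(
        sum(p_tau * signs[k] for p_tau, signs in zip(p, SUBSTATES))
        for k in range(3)
    )


def purity(p):
    """Purity P as the sum of the squared spin expectation values."""
    return sum(x * x for x in spin_expectations(p))


def classical_rule(p_before):
    """Keep the relative weights of the substates with S_z = +1 and renormalize."""
    kept = p_before[:4]
    norm = sum(kept)
    return tuple(x / norm for x in kept) + (0.0, 0.0, 0.0, 0.0)


def quantum_rule(p1):
    """State after finding S_z = +1 with <S_x> = <S_y> = 0; p1 stays free in [0, 1/2]."""
    if not 0.0 <= p1 <= 0.5:
        raise ValueError("p1 must lie in the interval [0, 1/2]")
    return (p1, 0.5 - p1, 0.5 - p1, p1, 0.0, 0.0, 0.0, 0.0)


before = (1 / 3, 0.0, 0.0, 0.0, 1 / 3, 0.0, 0.0, 1 / 3)   # purity 1/3
after_classical = classical_rule(before)                  # sharp substate, purity 3
after_quantum = quantum_rule(0.2)                         # purity 1
```

The classical rule leaves the sharp substate $p_1 = 1$ with P = 3, whereas every member of the one-parameter family returned by `quantum_rule` has $\langle S_z\rangle = 1$, $\langle S_x\rangle = \langle S_y\rangle = 0$ and P = 1, in accordance with eq. (107).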

Although the three cartesian spins serve as an instructive 
example for many of the crucial statistical properties of 
quantum systems, they do not reproduce all features of a 
two-state quantum system. Arbitrary linear combinations 
of quantum operators do not have corresponding classical 
observables in the ensemble with only eight states. For 
this purpose we have to extend the discussion to classical 
ensembles with infinitely many states, as we have discussed
in sect. IV.

X. TIME EVOLUTION 

In this section we discuss the time evolution in the classical
statistical system. Assume that at some time $t_1$ the
probability distribution is $\{p_\sigma\}$ and at some later time $t_2$
it has changed to a different distribution $\{p'_\sigma\}$. The observables
are kept fixed and we want to study how their expectation
values change. We may define "transition probabilities"
$S_{\sigma\rho}$ such that (with summation over repeated indices)

$$p_\sigma(t_2) = S_{\sigma\rho}(t_2, t_1)\, p_\rho(t_1). \qquad (108)$$

The transition matrix $S_{\sigma\rho}$ should conserve the unit sum,
$\sum_\sigma p_\sigma(t_2) = 1$. By the process of reduction of degrees of
freedom we can associate to $\{p_\sigma\}$ and $\{p'_\sigma\}$ effective probabilities
for an effective three-state classical system, namely
$\rho_k(t_1)$ and $\rho_k(t_2)$, $k = 1 \dots 3$. The transition matrix $S_{\sigma\rho}$
induces a reduced transition matrix $S_{kl}$ for the density matrix,

$$\rho_k(t_2) = S_{kl}(t_2, t_1)\, \rho_l(t_1). \qquad (109)$$

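It is related to $S_{\sigma\rho}$ by

$$S_{kl}(t,t') = \frac{\sum_{\sigma,\tau,\rho} S_{\sigma\tau}(t,t')\, p_\tau(t')\, p_\rho(t')\, \bar A^{(k)}_\sigma \bar A^{(l)}_\rho}{\sum_m \rho_m^2(t')}. \qquad (110)$$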

For the minimal manifold of micro-states $S^2$, $\sigma =
(f_1, f_2, f_3)$, $\sum_k f_k^2 = 1$, the condition

$$\sum_k \rho_k^2(t) \leq 1 \qquad (111)$$

is preserved by the transformation (109) by construction.
For more general manifolds of micro-states we assume that
the condition (111) holds for all times t.

For the computation of expectation values for the spin
observables and their conditional correlations at any given
time t one needs only to know $\rho(t)$. The reduced transition
matrix $S_{kl}(t_2, t_1)$ is then sufficient for a description of the
time evolution. We observe that many different transition
matrices $S_{\sigma\rho}$ are mapped onto the same $S_{kl}$, such that
actually only a limited amount of information about $S_{\sigma\rho}$
is needed. We can consider equivalence classes for $S_{\sigma\rho}$,
where two transition matrices leading to the same $S_{kl}$ are
considered to be equivalent. Similarly, equivalence classes
for probability distributions $\{p_\sigma\}$ are characterized by $\rho_k$.
Let us introduce the concept of purity,

$$P = \sum_k \rho_k^2. \qquad (112)$$

Then $P = 1$ corresponds to pure states, $P < 1$ to mixed
states and $P = 0$ to equipartition, where all $\rho_k = 0$.
(The equivalence class of equipartition contains the classical
equipartition state $p_\sigma = 1/N$.) The most general time
evolution of the classical system can change the purity. We
will concentrate first on the case where the purity is conserved.
The unitary time evolution for two-state quantum
mechanics will follow from this simple assumption.

1. Unitary quantum time evolution 

In fact, the conservation of the length of the vector $\rho_k$
implies that $S_{kl}$ is an orthogonal O(3)-matrix, $S_{kl} = \hat S_{kl}$,
$\sum_l \hat S_{kl} \hat S_{ml} = \delta_{km}$. Arbitrary O(3)-transformations acting
on the $\rho_k$ can be represented as unitary transformations
acting on the density matrix $\rho$ (cf. eq. (|4T|) ) as

$$\rho(t_2) = U(t_2, t_1)\, \rho(t_1)\, U^\dagger(t_2, t_1). \qquad (113)$$



This follows from the equivalence of SU(2) and SO(3) (up
to a factor $Z_2$, with $\rho$ invariant under the $Z_2$ transformation).
Parameterizing



one finds 



(114) 

$$S_{kl} = (1 - 2\sin^2\gamma)\,\delta_{kl} + 2\sin^2\gamma\,\beta_k \beta_l + 2\sin\gamma\cos\gamma\,\epsilon_{klm}\beta_m$$

= d'^ = ^. (115) 

7 

From eq. (113) the time evolution in two-state quantum
mechanics follows in a standard way. We may consider
infinitesimal changes of time, for which we find the
von Neumann equation

$$\frac{\partial \rho}{\partial t} = -i[H, \rho], \quad H = i\,\frac{\partial U(t_2, t_1)}{\partial t_2}\, U^\dagger(t_2, t_1) = H^\dagger. \qquad (116)$$

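Pure states obey then the Schrödinger equation with a hermitean
Hamilton operator H,

$$i\,\frac{\partial \psi}{\partial t} = H \psi. \qquad (117)$$

With $H = H_k \tau_k$ we can write eq. (116) as

$$\frac{\partial \rho_k}{\partial t} = 2 H_l \rho_m \epsilon_{lmk} \qquad (118)$$

and compare with the general formula

$$\frac{\partial \rho_k}{\partial t} = \frac{\partial S_{kl}}{\partial t}\, \rho_l(t_1) = \frac{\partial S_{kl}}{\partial t}\, S^{-1}_{lm}\, \rho_m. \qquad (119)$$

For $S = \hat S$ we extract

$$H_n = \frac{1}{4}\,\epsilon_{nmk}\, \frac{\partial \hat S_{kl}}{\partial t}\, \hat S_{ml}, \qquad (120)$$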



which yields H in terms of S. We observe that eq. (118)
is consistent with eq. (119) only for orthogonal matrices
S - otherwise the r.h.s. of eq. (119) is not antisymmetric
under the exchange of the indices k and m in the matrix
multiplying $\rho_m$, as is the r.h.s. of eq. (118).
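The relation between a unitary transformation of $\rho$ and an orthogonal rotation of the vector $\rho_k$ can be verified numerically. The parameterization used below, $U = \cos\gamma + i \sin\gamma\, \beta_k \tau_k$ with a real unit vector $\beta_k$, is an assumption made for this sketch, chosen because it reproduces the expression for $S_{kl}$ with the signs quoted in the text. The rotation is extracted as $S_{kl} = \frac{1}{2}\mathrm{tr}(\tau_k U \tau_l U^\dagger)$, and all function names are hypothetical.

```python
import math

# Pauli matrices tau_1, tau_2, tau_3 (assumed convention).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def levi_civita(k, l, m):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (k - l) * (l - m) * (m - k) // 2


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(u):
    """Hermitean conjugate of a 2x2 matrix."""
    return [[complex(u[j][i]).conjugate() for j in range(2)] for i in range(2)]


def unitary(gamma, beta):
    """Assumed parameterization U = cos(gamma) + i sin(gamma) beta_k tau_k."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [
        [(c if i == j else 0.0)
         + 1j * s * sum(beta[k] * TAU[k][i][j] for k in range(3))
         for j in range(2)]
        for i in range(2)
    ]


def rotation_from_unitary(u):
    """S_kl = tr(tau_k U tau_l U^dagger) / 2, so that rho_k -> S_kl rho_l."""
    u_dag = dagger(u)
    result = []
    for k in range(3):
        row = []
        for l in range(3):
            m = matmul(TAU[k], matmul(u, matmul(TAU[l], u_dag)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        result.append(row)
    return result


def rotation_closed_form(gamma, beta):
    """Closed form of S_kl in terms of gamma and beta as quoted in the text."""
    c, s = math.cos(gamma), math.sin(gamma)
    return [
        [(1 - 2 * s * s) * (1.0 if k == l else 0.0)
         + 2 * s * s * beta[k] * beta[l]
         + 2 * s * c * sum(levi_civita(k, l, m) * beta[m] for m in range(3))
         for l in range(3)]
        for k in range(3)
    ]


gamma, beta = 0.7, (1 / 3, 2 / 3, 2 / 3)     # illustrative values, |beta| = 1
S_num = rotation_from_unitary(unitary(gamma, beta))
S_formula = rotation_closed_form(gamma, beta)
```

The two matrices agree entry by entry, and `S_num` is orthogonal, so the length of $\rho_k$ and hence the purity is conserved. The rotation angle is $2\gamma$, which reflects the two-to-one map from SU(2) to SO(3).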

The unitary transformation in quantum mechanics can
easily be related to an appropriate time evolution of classical
probabilities on the level of micro-states. Consider first
the minimal manifold of micro-states $S^2$. It is sufficient
that the classical time evolution of $p(f_k)$ is described by a
rotation of the three-dimensional unit vector $(f_1, f_2, f_3)$ or
a corresponding rotation of the angles $(\varphi, \vartheta)$. Let us consider
a statistical system where the probability distribution
at time $t_1$, $p(\vartheta, \varphi; t_1)$, changes at some later time $t_2$ to

$$p(\vartheta, \varphi; t_2) = p(\vartheta', \varphi'; t_1), \qquad (121)$$

where $\vartheta' = \vartheta'(t_2, t_1, \vartheta, \varphi)$ is given by a time dependent
rotation on $S^2$, and similar for $\varphi'$. We can then compute
the time dependence of the elements of the density matrix



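$$\begin{aligned}
\rho_k(t_2) &= \int d\Omega\; p(\vartheta, \varphi; t_2)\, f_k(\vartheta, \varphi) \\
&= \int d\Omega\; p(\vartheta', \varphi'; t_1)\, f_k(\vartheta, \varphi) \qquad (122) \\
&= \int d\Omega'\; p(\vartheta', \varphi'; t_1)\, f_k\big(\vartheta(\vartheta', \varphi'), \varphi(\vartheta', \varphi')\big),
\end{aligned}$$

where $\vartheta(\vartheta', \varphi', t_2, t_1)$ expresses the "fixed angle" $\vartheta$ in terms
of the "rotating angle" $\vartheta'$. Since $f_k$ is a unit vector on $S^2$
one has

$$f_k\big(\vartheta(\vartheta', \varphi'), \varphi(\vartheta', \varphi')\big) = S_{kl}(t_2, t_1)\, f_l(\vartheta', \varphi'), \qquad (123)$$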

with $S_{kl}$ an orthogonal matrix depending on time. Insertion
into eq. (122) yields

$$\rho_k(t_2) = S_{kl}(t_2, t_1) \int d\Omega'\; p(\vartheta', \varphi'; t_1)\, f_l(\vartheta', \varphi') = S_{kl}(t_2, t_1)\, \rho_l(t_1). \qquad (124)$$

As we have shown above, the orthogonal transformation
(109) results in the unitary quantum evolution (116), (117).
The generalization to extended manifolds of micro-states or
to substates is straightforward. If the states of the extended
manifold are characterized by $(\vartheta, \varphi)$ and additional
parameters a (which are assumed to be invariant under
rotations) the probability distribution $p(\vartheta, \varphi, a, t)$ has to
change according to eq. (121), with a kept fixed.

2. Decoherence and syncoherence 

We next consider the general case of the evolution (109)
where $S_{kl}$ is not necessarily an orthogonal matrix. An arbitrary
change of the vector $\rho_k$ can be written as a combination
of an orthogonal transformation and a scaling,
$S_{kl} = \hat S_{kl}\, d$. This adds to eq. (116) a scaling part

$$\frac{\partial \rho}{\partial t} = -i[H, \rho] + D\left(\rho - \frac{1}{2}\right), \quad D = \frac{\partial \ln d}{\partial t}. \qquad (125)$$

For negative D the density matrix will approach equipartition,
$\rho = 1/2$, $\rho_k = 0$, as time increases. This describes
decoherence of a quantum system. For positive D the purity
tends to increase

$$\partial_t P = 2 D P. \qquad (126)$$

For any arbitrary distribution $\{p_\sigma\}$ a classical pure state
has the maximum possible purity, $\mathrm{tr}\rho^2 = 1$. For positive D
the system has therefore a tendency to reach a pure state
for large time.
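For constant D the scaling part of eq. (125) can be integrated directly: each component obeys $\partial_t \rho_k = D \rho_k$, so that $P(t) = P(0)\, e^{2Dt}$ in accordance with eq. (126). The sketch below assumes H = 0 and a time independent D, and uses a plain Euler step. The function name `evolve_scaling` and the sample values are hypothetical.

```python
import math


def purity(rho):
    """Purity P as the squared length of the vector rho_k."""
    return sum(x * x for x in rho)


def evolve_scaling(rho0, D, t, steps=20000):
    """Euler integration of d rho_k / dt = D rho_k (scaling part only, H = 0)."""
    dt = t / steps
    rho = list(rho0)
    for _ in range(steps):
        rho = [x + dt * D * x for x in rho]
    return rho


rho0 = (0.3, 0.0, 0.4)                      # mixed state with purity 0.25

# Negative D: decoherence, the purity decays towards equipartition.
P_decay = purity(evolve_scaling(rho0, D=-0.5, t=2.0))
P_decay_exact = purity(rho0) * math.exp(2 * (-0.5) * 2.0)

# Positive D: syncoherence, the purity grows towards one.
P_growth = purity(evolve_scaling(rho0, D=0.25, t=2.0))
P_growth_exact = purity(rho0) * math.exp(2 * 0.25 * 2.0)
```

With a constant positive D the purity would eventually exceed one, which is why D has to vanish when P = 1 is reached. A rotation generated by H would leave both numbers unchanged, since it preserves the length of $\rho_k$.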

In general, $S_{kl}$ may depend on $\rho_k$, and this also holds for
H and D. The standard linear time evolution of quantum
mechanics obtains only in the limit where H is independent
of $\rho$ and D vanishes. If D depends on $\rho_k$, it will itself
depend on time and we may write an effective evolution
equation

$$\partial_t D = \beta_D(\rho_k, D). \qquad (127)$$

For $\mathrm{tr}\rho^2 = 1$ a positive value of D is forbidden by the general
properties of the probability distribution. Indeed, for
the minimal manifold of micro-states $S^2$ a pure state has
the maximal possible value $\mathrm{tr}\rho^2 = 1$ - a pure state cannot
get purer than pure. This follows directly from the
definition of $\rho_k$, cf. eqs. (55), (111), and generalizes to extended
manifolds whenever it is possible to project them
on $S^2$ such that eq. (111) holds. We consider here only this
type of systems. A positive value $D > 0$ for $P = 1$ would
then imply a further increase of P, which is excluded. If
in the vicinity of pure states D is positive for ensembles
with $P = \rho_k \rho_k < 1$, we conclude that $\beta_D$ must have a
zero for $D = 0$ and $P = 1$. If this fixed point is attractive
for increasing t, a pure state will be approached asymptotically.
Unitarity of the time evolution is then a simple
consequence of the system approaching this fixed point for
large t. We call this approach to a pure state syncoherence.
A physical example is a mixed state of an atom involving
different energy levels. Due to radiative decay it can exchange
energy with its environment and may finally end in
the ground state. This is a pure state if the ground state
is not degenerate, and pure states may also be reached for
the degenerate case by appropriate "experimental preparation".

The existence of fixed points for $P = 0$ and $P = 1$ is
quite generic. The precise form of approach depends, of
course, on the system. If $\beta_D$ admits a Taylor expansion for
the fixed point at $P = 1$ and $D = 0$, the lowest order terms
are

$$\beta_D = -aD + b(1 - P), \qquad (128)$$

where the coefficients may depend on p^ / yTrp^ and ■ 
In the vicinity of the fixed point and for approximately 
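constant a and b eq. (128) implies an exponential approach
to the pure state,

$$1 - P = x_1 e^{-\epsilon_1 t} + x_2 e^{-\epsilon_2 t},$$

$$D = \epsilon_1 x_1 e^{-\epsilon_1 t} + \epsilon_2 x_2 e^{-\epsilon_2 t}, \qquad (129)$$

$$\epsilon_{1,2} = \frac{1}{2}\left(a \pm \sqrt{a^2 - 4b}\right), \qquad (130)$$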



provided a > , < 6 < 

3. Hamiltonian quantum evolution 

If the fixed point with $D = 0$ is approached for a sufficiently
large time, we will encounter the standard linear
unitary time evolution of quantum mechanics if H becomes
independent of $\rho_k$ at the fixed point. Otherwise, the system
would be attracted to a unitary, but non-linear extended
version of quantum mechanics - a possibility that is highly
interesting in its own right. We should note, however, that
symmetries may enforce linear quantum mechanics. For
example, if SO(3) symmetry is realized at the fixed point,
the Hamiltonian can depend on $\rho_k$ only via the invariant
$\rho_k \rho_k$. This approaches a constant, and therefore the fixed
point value of H has to be independent of $\rho_k$.

The case where H becomes independent of $\rho_k$ at the
fixed point for $P = 1$ seems rather generic. It corresponds
to the Hamiltonian evolution of quantum mechanical pure
states (117). For H independent of $\rho_k$ we may write

$$H = \sum_k H_k \tau_k + H_0, \qquad (131)$$

where the coefficients $H_k$, $H_0$ do not depend on the quantum
state or on t. Therefore H can be associated with
an observable of the system (HO)) . Since H generates the 
time translation we infer from Noether's theorem that this 
observable is the conserved energy of the system. This 
demonstrates how the pure state fixed point is related to 
the isolation of the system from its environment in the 
sense that no energy is exchanged. 

In summary of this section, we find that the quantum
mechanical time evolution can emerge naturally from
a large class of time evolving probabilities $p_\sigma(t)$ (108).
The reduction to the time evolution of the density matrix
(125) is always possible. Generic time evolutions may
be attracted either to quantum mechanical equipartition,
$\rho_k = 0$, or to a pure quantum state. The asymptotic approach
to the pure quantum state fixed point could provide
an explanation why we can observe so many quantum sys- 
tems in nature. Indeed, if we consider our system as a 
subsystem of a much larger system, the time evolution of 
the subsystem may allow for dissipation of energy into the 
larger system. Quite often, the lowest energy state is a pure 
state which may be approached for large time. A mixed 
state of atoms in various energy levels will after some time 
be found in the pure ground state if energy can be dissi- 
pated by radiation. 

XI. PSEUDO QUANTUM SYSTEMS 

We have seen how quantum mechanics arises from classical
statistics in the limit of infinitely many micro-states,
if probabilistic observables and conditional correlations are
considered and the time evolution conserves purity. It is
interesting to ask if "approximate quantum behavior" can
be observed if the number N of micro-states remains finite.
The investigation of systems with finite N may also be relevant
for practical computations of quantum systems, in
the sense that one may consider a series with increasing N
and take the limit $N \to \infty$ for which all quantities should
converge to the quantities in the quantum system. We will
call a classical statistical system with finite N a "pseudo
quantum system" if it fulfills the following criteria:

(i) There are N micro-states labeled by unit vectors
$(f_k)$, $k = 1, 2, 3$, $\sum_k f_k^2 = 1$, with probabilities
$p_\sigma = p(f_k)$.

(ii) A group of discrete symmetry transformations $G_N$
acts in the space of $f_k$. It is a subgroup of SO(3)
and converges to SO(3) in the limit $N \to \infty$. (For
the concrete example in sect. 3 this subgroup is $Z_N$,
but we consider here more general cases and three
dimensional discrete rotations.)

(iii) One considers N two-level probabilistic observables,
labeled by unit vectors $(e_k)$, $\sum_k e_k^2 = 1$, i.e. $A(e_k)$.
The symmetry group $G_N$ also acts on $e_k$, such that
the scalar product $\sum_k e_k f_k$ is invariant. The mean
values in the micro-states $\sigma = f_k$ are given by

$$\bar A_\sigma(e_k) = \sum_k f_k e_k, \qquad (132)$$

and the expectation values read

$$\langle A(e_k)\rangle = \sum_{\{f_k\}} p(f_k) \sum_k f_k e_k. \qquad (133)$$

If we consider conditional correlations, these pseudo-quantum
systems will converge to two-state quantum mechanics
for $N \to \infty$. (Generalizations for quantum mechanics
with more than two states are possible, but will
not be considered in this section.)

We want to understand the differences between the
pseudo quantum systems and quantum mechanics. For
this purpose we first perform the reduction of the degrees
of freedom to three effective micro-states, with effective
probabilities $\rho_k$. This reduction should keep the expectation
values of all observables $\langle A(e_k)\rangle$ unchanged. It can be
achieved by

$$\rho_k = \sum_{\{f_k\}} p(f_k)\, f_k, \qquad (134)$$

where the sum is over all micro-states. This guarantees
that the expectation values of all spins can indeed be written
as

$$\langle A(e_k)\rangle = \sum_k \rho_k e_k, \qquad (135)$$

verifying eq. (38). We observe that the expression (135)
has no ambiguity and does not depend on which effective
micro-state is selected while the others are integrated out.
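The reduction of eqs. (133)-(135) can be illustrated with a small pseudo quantum system. The example below is only a sketch under an assumption not taken from the text: the N = 6 micro-states are chosen as the corners of an octahedron, which are mapped onto each other by the discrete rotations of the octahedral group. The function names and the sample probabilities are hypothetical.

```python
# Illustrative choice of N = 6 micro-states: unit vectors f along the
# positive and negative coordinate axes (corners of an octahedron).
MICRO_STATES = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def reduce_to_rho(p):
    """Effective probabilities rho_k = sum_f p(f) f_k, as in eq. (134)."""
    return tuple(
        sum(p_f * f[k] for p_f, f in zip(p, MICRO_STATES)) for k in range(3)
    )


def expectation_micro(p, e):
    """<A(e)> as a sum over micro-states of p(f) times the mean value f.e, eq. (133)."""
    return sum(
        p_f * sum(f[k] * e[k] for k in range(3))
        for p_f, f in zip(p, MICRO_STATES)
    )


def expectation_reduced(p, e):
    """<A(e)> from the reduced description, sum_k rho_k e_k, eq. (135)."""
    rho = reduce_to_rho(p)
    return sum(rho[k] * e[k] for k in range(3))


def purity(p):
    """Purity of the reduced state."""
    return sum(x * x for x in reduce_to_rho(p))


p_mixed = (0.4, 0.1, 0.2, 0.1, 0.15, 0.05)   # arbitrary normalized probabilities
p_pure = (1, 0, 0, 0, 0, 0)                  # classical pure state at one corner
e = (0.6, 0.0, 0.8)                          # direction of the observable A(e)
```

Both ways of computing $\langle A(e)\rangle$ agree for any probability distribution. The reduced vector $\rho_k$ is a convex combination of the six corners, so it stays inside the octahedron and reaches P = 1 only at a corner.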

At this point the only difference to quantum mechanics is
the restricted range of $f_k$, which results in a restricted range
of $\rho_k$. This range has the geometry of a (three-dimensional)
polyhedron with N corners, where the corners are given by
the vectors $f_k$ and correspond to the classical pure states.
It approaches the sphere in the limit $N \to \infty$ as a result
of SO(3)-symmetry. Thus the limiting SO(3)-symmetry
guarantees that quantum mechanics is reached in the limit
$N \to \infty$. The conditional correlations are defined for the
pseudo quantum system just as for the quantum system.
(The only difference may be a restricted number of observables
$A(e_k)$.) The formalism of quantum mechanics can
be applied to pseudo quantum systems, with the only restriction
that the range of $\rho_k$ and therefore the number of
pure states $|\psi\rangle$ is restricted - there are precisely N different
pure states $|\psi\rangle$ instead of a continuum. Also the number
of observables may change from the continuous family of
spins to a finite number $A(e_k)$, but this is not necessary.

These differences are necessarily reflected in the time
evolution. For pure states, a unitary evolution is only possible
for discrete steps, corresponding to the allowed discrete
symmetry transformations of the group $G_N$. Then
the Hamilton operator becomes the transfer matrix. Alternatively,
one may consider a continuous time evolution
which does not respect the conservation of purity, such that
$P = \sum_k \rho_k^2 \neq 1$ for times in the interval between the discrete
time steps for which a pure state is transformed into
another pure state, $\tau_i < t < \tau_{i+1}$. Unitarity is violated for
these intermediate times, but restored whenever t reaches
$\tau_i$. It is therefore maintained in the average for long enough
time in units of $\tau_{i+1} - \tau_i$.

Pseudo quantum systems can only occur if the continuous
symmetry SO(3) is violated and reduced to a discrete
subgroup $G_N$. Inversely, a classical statistical system with
SO(3)-symmetry has necessarily infinitely many micro-states.
Quantum mechanics arises whenever the time evolution
of classical probabilities can be described by SO(3)-rotations,
provided the appropriate two-level operators and
conditional correlations are considered. In this sense it is
not a very special situation within classical statistics. We
emphasize that the SO(3) rotations do not necessarily reflect
the rotations in physical space, but may be more abstract
isospin-type rotations. It is not necessary that the
system is SO(3)-symmetric. Rather it is sufficient that the
time evolution describes a continuous trajectory on $S^2$. For
example, the trajectories may be U(1)-rotations, as for the
quantum mechanics of a spin in a homogeneous magnetic
field. Continuous rotations can also arise if the Hamiltonian
has no continuous symmetry at all.



XII. REALIZATIONS OF PROBABILISTIC 
OBSERVABLES 

Probabilistic observables play an important role in the
derivation of the laws of quantum mechanics from classical
statistics presented in this paper. Two attitudes towards
this concept are possible. One takes probabilistic observables
as the basic concept. It may be motivated by the
assumption that the description of reality is genuinely
probabilistic. If the state of the world can only be described by
probabilistic concepts, it seems natural that the basic notion
of an observable should also be probabilistic. Taking
this attitude, the probabilistic character of an observable
is the genuine situation. Classical observables that take
a sharp value in all micro-states of the system are then a
special case, corresponding to an idealization.

As an alternative, one may also follow an approach where 
classical observables are the basic objects. Probabilistic 
observables are then an effective concept that arises if sev- 
eral states are grouped together into a new intermediate 
state, which may then be treated as a micro-state on a 
higher level. This approach resembles the familiar concept 
of block spins. In this section we compare both concepts 
in our setting where quantum physics arises from classi- 
cal statistics. We emphasize that our description of the 
two-state quantum system does not depend on how the 
probabilistic observables are implemented - either as "fundamental"
or as "composite" objects.



1. Realization as classical observables 

We start with the implementation in terms of classical
observables where the probabilistic observables appear as
composite objects. In this case the micro-states of this paper
correspond to the intermediate states. They are composed
of substates, i.e. the "true microscopic states" for
which the observables take fixed values. We have already
briefly alluded to this concept in sects. II, III. Consider the
spin observable $A^{(1)}$ or $A(e_k) = A(1,0,0)$. Since in every
micro-state $(f_k)$ it is characterized by relative probabilities
for values $\pm 1$, one needs a classical observable which can
only take either the values $+1$ or $-1$ for any substate. One
therefore needs at least two substates for any micro-state
with $\bar{A}_{f_k}(1,0,0) \neq \pm 1$. The mean value in the micro-state
$(f_k)$, $\bar{A}_{f_k}(1,0,0)$, is then given by the relative probabilities
of the two substates. For a given $f_k$ these relative
probabilities are fixed "once and forever". Our setting and
the quantum mechanical time evolution do not describe
situations where these relative probabilities between the
substates change.

At this point we have derived the composite probabilistic
observable $A(1,0,0)$ from a classical observable
$A^{(c)}(1,0,0)$ which takes respectively the values $A^{(c)}_+ = +1$
in one of the substates, and $A^{(c)}_- = -1$ in the
other one. Denoting the relative probabilities of the two
substates of the state $(f_k)$ with $w_+(f_k)$, $w_-(f_k) = 1 - w_+(f_k)$,
the probabilities of the substates are given by
$p_+(f_k) = p(f_k)\,w_+(f_k)$ and $p_-(f_k) = p(f_k)\,w_-(f_k)$, and
the mean value of $A(1,0,0)$ in the micro-state $(f_k)$ reads
$\bar{A}_{f_k}(1,0,0) = w_+(f_k) - w_-(f_k)$. For the opposite spin,
$A(-1,0,0)$, one finds $\bar{A}_{f_k}(-1,0,0) = w_-(f_k) - w_+(f_k)$.

We next add a second two-level observable $A^{(2)} = A(0,1,0)$.
Since the relative probabilities $w_\pm(f_k)$ are already
fixed by the mean values $\bar{A}_{f_k}(1,0,0)$, we need a
further classical observable $A^{(c)}(0,1,0)$ that again takes values $+1$
or $-1$. Each substate needed for a description of $A(1,0,0)$
has to be divided again into two further substates, such
that each state $(f_k)$ has now four substates. This process
continues if we add the "diagonal spins"
$A(\tfrac{1}{\sqrt{2}},\tfrac{1}{\sqrt{2}},0)$ etc. For $N$ two-level observables
(counting $A(1,0,0)$ and $A(-1,0,0)$ separately) one needs $N/2$
classical observables and $2^{N/2}$ substates for every state $(f_k)$.

More formally, the possible states of the ensemble can be
characterized by $(f_k; \{\gamma(g_k)\})$, where $f_k \in S^2$, $g_k \in S^2/Z_2$
(using $\gamma(-g_k) = -\gamma(g_k)$) and $\gamma(g_k) = \pm 1$ associates to
every direction $g_k$ a separate discrete variable. The probabilities
of these states read (in a discrete notation for finite $N$)

$$p(f_k;\{\gamma(g_k)\}) = \prod_{g_k}\left[\tfrac{1}{2}\bigl(1 + \gamma(g_k)\, f_k g_k\bigr)\right] p(f_k). \tag{136}$$



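All observables $A(e_k)$ have a fixed value $+1$ or $-1$ in every
state, given by $\gamma(e_k)$. (In other words, the observable $A(e_k)$
picks out a specific $\gamma(g_k = e_k)$ and is independent of all
$\gamma(g_k \neq e_k)$.) Integrating out the substates yields

$$\sum_{\{\gamma(g_k)\}} p(f_k;\{\gamma(g_k)\}) = p(f_k) \tag{137}$$

and

$$\bar{A}_{f_k}(e_k) = \frac{1}{p(f_k)}\sum_{\{\gamma(g_k)\}} p(f_k;\{\gamma(g_k)\})\,\gamma(e_k)
= \frac{1}{2}\sum_{\gamma(e_k)=\pm 1}\bigl[1 + \gamma(e_k)\, f_k e_k\bigr]\gamma(e_k) = f_k e_k, \tag{138}$$
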

such that one recovers the micro-states $f_k$ and the probabilistic
observables at an intermediate level. In principle,
one could try to realize this situation by a "hidden variable
theory". For $N \to \infty$ this would involve infinitely
many discrete variables $\gamma(g_k)$ plus two continuous angular
variables which take values on $S^2$ (i.e. $f_k$). Some law
(deterministic or not) would have to reproduce the probability
distribution (136) for finding the values $(f_k, \gamma(g_k))$ of the
hidden variables.
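
As an illustration of eqs. (136)-(138), the following minimal Python sketch enumerates all substates for a small, finite set of directions and checks numerically that summing over the discrete variables returns $p(f_k)$ and the mean values $f_k e_k$. The function names, the choice of three orthogonal directions and the numerical values are assumptions made only for this example.

```python
import itertools
import math

def substate_probability(f, directions, gammas, p_f=1.0):
    """Eq. (136): p(f) times the product over directions g of (1 + gamma(g) f.g)/2."""
    prob = p_f
    for g, gamma in zip(directions, gammas):
        overlap = sum(fi * gi for fi, gi in zip(f, g))
        prob *= 0.5 * (1.0 + gamma * overlap)
    return prob

def integrate_out_substates(f, directions, p_f=1.0):
    """Sum over all assignments gamma(g) = +1 or -1.

    Returns the total probability, eq. (137), and the mean value of each
    gamma(g) within the micro-state f, i.e. normalized by p(f), eq. (138).
    """
    total = 0.0
    means = [0.0] * len(directions)
    for gammas in itertools.product((+1, -1), repeat=len(directions)):
        p = substate_probability(f, directions, gammas, p_f)
        total += p
        for i, gamma in enumerate(gammas):
            means[i] += gamma * p / p_f
    return total, means

# Illustrative micro-state on the unit sphere and three directions (assumed values).
f = (math.sin(0.7) * math.cos(0.3), math.sin(0.7) * math.sin(0.3), math.cos(0.7))
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
total, means = integrate_out_substates(f, directions, p_f=0.25)
```

With $N/2 = 3$ directions the loop runs over $2^3 = 8$ substates, which makes the growth of the number of substates with $N$ explicit.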

While such a description of probabilistic observables in 
terms of classical observables is possible, it needs for large 
N a very high number of substates. In consequence, one
encounters a very high degree of redundancy of the descrip- 
tion by unobservable quantities. In addition, the fact that 
the relative probabilities for the substates do not change 
in the course of the time evolution may need some expla- 
nation. Part of the complexity arises in this case from our 
use of microstates with fixed distributions for the proba- 
bilistic observables. Omitting the microstates one can con- 
struct much simpler classical statistical ensembles that re- 
alize two-state quantum mechanics. An explicit example 
for a classical statistical ensemble that describes two-state 
quantum mechanics together with its environment can be 
found in [Tsf. 

2. Fundamental probabilistic observables 

Alternatively, we may consider the notion of probabilistic
observables as fundamental. We may still formulate the
probabilistic observables in terms of a "basic observable"
which takes values $B = \pm 1$. However, one such observable
will now be sufficient for a description of all $A(e_k)$.
As a fundamental object, a probabilistic observable is defined
by the relative probabilities $w_\pm(f_k)$ to observe $B = 1$
or $B = -1$ in a given state $(f_k)$. Different probabilities
$w_\pm(f_k)$ simply define different probabilistic observables.
Instead of considering one fixed value of $w_\pm(f_k)$, a change
of the relative probability for a given $(f_k)$ describes now
the change from one observable to another.

The required relative probabilities $w_+(f_k)$ are easily
computed for all two-level observables $A(e_k)$ as

$$w_+(f_k; e_k) = \frac{1}{2}\bigl(1 + \bar{A}_{f_k}(e_k)\bigr) = \frac{1}{2}\Bigl(1 + \sum_k f_k e_k\Bigr). \tag{139}$$

We may still introduce two substates $(f_k^+)$ and $(f_k^-)$ for each
micro-state $(f_k)$, and consider $B$ as a classical observable
that takes the value $B = 1$ for all substates $(f_k^+)$ and
$B = -1$ for all substates $(f_k^-)$. However, the relative probability
with which the substates are counted depends now on the
observable $A(e_k)$ according to eq. (139). This dependence
on $e_k$ appears manifestly in the expectation values

$$\langle A(e_k)\rangle = \sum_{\{f_k\}}\sum_{\gamma=\pm 1} \gamma\, p_e(f_k,\gamma), \qquad
p_e(f_k,\gamma) = \frac{1}{2}\Bigl(1 + \gamma\sum_k f_k e_k\Bigr)\, p(f_k). \tag{140}$$

Here $p_e(f_k,\gamma)$ is the effective probability with which the
possible values of $B$, namely $\gamma = \pm 1$, are counted for every
substate. It obeys $p_e(f_k,\gamma) \geq 0$ and $\sum_{\{f_k\}}\sum_\gamma p_e(f_k,\gamma) = 1$.
However, $p_e$ depends now on the observable, i.e. on
$e_k$. This is a major difference from the usual setting in
classical statistics. It reflects the probabilistic nature of
the observables, where part of the probability information
is used for the definition of the observable - in our case
the relative substate probability (139). We note in this
context that eqs. (139), (140) define positive semidefinite
probabilities for $\sum_k e_k^2 = 1$. The scaling of observables is
achieved in this formulation by a scaling of $B$, i.e. by a
multiplication of the first eq. (139) by $\lambda$.

For fundamental probabilistic observables the correspondence
between classical statistics entities and quantum mechanical
objects becomes quite close. The basic variable $B$
in classical statistics can only take the values $\pm 1$,
corresponding to the eigenvalues of the normalized spin operators
in quantum mechanics and therefore to the possible
outcome of a single measurement of the observables. The
continuum of classical pure states on $S^2$ corresponds to the
continuum of quantum mechanical pure states. The continuum
of classical spin observables $A(e_k)$ corresponds to
the continuum of normalized spin operators in two-state
quantum mechanics. In the classical statistics setting the
mixed states are described at this stage by infinitely many
probabilities $p(f_k)$, while the density matrix $\rho$ in quantum
mechanics needs only one probability $w$ to decompose
$\rho = w\rho^{(1)} + (1 - w)\rho^{(2)}$ into two pure state density matrices
$\rho^{(1)}$ and $\rho^{(2)}$. In this respect, classical statistics remains
redundant. It describes quantities $p(f_k)$ that cannot be
determined by measurements of the two-level observables
$A(e_k)$. The redundancy can be removed by integrating over
the micro-states using eq. (134),

$$p_e(\rho_k,\gamma) = \frac{1}{2}\Bigl(1 + \gamma\sum_k \rho_k e_k\Bigr). \tag{141}$$

The formula (141) permits also a different interpretation.
One may consider $B$ as the true observable of the system,
with discrete values $+1$ or $-1$ in the two "basic states".
The basic states are further characterized by "external
properties", namely the "state of the atom" labeled by
$\rho_k = \sqrt{P}\, f_k$, and the "measurement orientation", labeled
by $e_k$. Thus a basic state can be parameterized by four angles,
the purity and one discrete variable $(f_k, e_k, P; \gamma)$, with
$f_k \in S^2$, $e_k \in S^2/Z_2$, $P \in [0,1]$, $\gamma \in Z_2$. The probabilities
for the two basic states obey

$$p(f_k, e_k, P; \gamma) = \frac{1}{2}\Bigl(1 + \gamma\sqrt{P}\sum_k f_k e_k\Bigr) = \frac{1}{2}\Bigl(1 + \gamma\sum_k \rho_k e_k\Bigr), \tag{142}$$

such that actually only the relative angle $\varphi$ between the
atom-polarization and the apparatus orientation matters,
i.e. $\sum_k f_k e_k = \cos\varphi$. One has obviously

$$\sum_{\gamma=\pm 1} p(f_k, e_k, P; \gamma) = 1, \qquad 0 \leq p(f_k, e_k, P; \gamma) \leq 1,$$
$$\langle B\rangle = \sum_{\gamma=\pm 1} \gamma\, p(f_k, e_k, P; \gamma) = \sqrt{P}\sum_k f_k e_k = \sum_k \rho_k e_k. \tag{143}$$

This point of view reflects precisely the setting of the
Stern-Gerlach experiment, which splits an incoming polarized
atom beam into two beams with different directions,
corresponding to $B = \pm 1$. The probability of finding an
atom in the $B = 1$ direction only depends on the angle
between the polarization and the inhomogeneous magnetic
field of the apparatus, as given by $\sum_k f_k e_k$, and on the
degree of polarization, as given by $P$. It follows from eq.
(142) with $\gamma = 1$. In this setting the time evolution of the
"atom state" is described by the deterministic evolution
equation for the "external parameters" $\rho_k$. The shift in the
point of view as compared to the probabilistic observables
$A(e_k)$ consists in attributing the information contained in
$e_k$ to the basic state, rather than to the observable. The
price for the simplicity of this picture is, of course, the
explicit appearance of the "measurement orientation" in the
relative probability for the $B = 1$ and $B = -1$ states. The
probability of finding $B = 1$ or $B = -1$ not only depends
on the state of the atom, but also on the state of the
apparatus used for the measurements.
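
A small numerical sketch of eqs. (142), (143) may make the Stern-Gerlach reading concrete. It is only an illustration: the function names and the chosen angle are assumptions of this example, and the atom polarization is placed along the third axis for convenience.

```python
import math

def basic_state_probability(f, e, purity, gamma):
    """Eq. (142): p = (1 + gamma * sqrt(P) * f.e) / 2 for gamma = +1 or -1."""
    overlap = sum(fk * ek for fk, ek in zip(f, e))
    return 0.5 * (1.0 + gamma * math.sqrt(purity) * overlap)

def mean_basic_observable(f, e, purity):
    """Eq. (143): <B> is the sum over gamma of gamma * p(f, e, P; gamma)."""
    return sum(g * basic_state_probability(f, e, purity, g) for g in (+1, -1))

phi = 1.2                                  # relative angle (assumed value)
f = (0.0, 0.0, 1.0)                        # polarization of the atom
e = (math.sin(phi), 0.0, math.cos(phi))    # orientation of the apparatus

p_up = basic_state_probability(f, e, 1.0, +1)        # fully polarized beam
mean_partial = mean_basic_observable(f, e, 0.25)     # purity P = 1/4
```

For a fully polarized beam the probability for $B = 1$ reduces to $\cos^2(\varphi/2)$, and for purity $P$ the mean value is $\sqrt{P}\cos\varphi$, so only the relative angle and the degree of polarization enter.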

In the formulation with substates, a given substate
$(f_k, \{\gamma(e_k)\})$ contains simultaneously the information
about the values of infinitely many observables $A(e_k)$. The
orientation of the measurement apparatus then decides
which one of the $A(e_k)$ is measured. In contrast, the picture
with basis states has only one "basis observable" $B$.
One has to specify the condition under which it can be
measured through the measurement orientation $e_k$. No
apparatus must actually be present - eq. (142) defines the
outcome for all possible measurement directions. For an
actual measurement, the orientation of the apparatus then
decides which one of the $e_k$ applies for the given measurement
situation.

XIII. FOUR STATE QUANTUM SYSTEM AND
ENTANGLEMENT

Most of the conceptual issues of how quantum mechanics
emerges from a classical statistical setting can be described
in the simplest system, which corresponds to two
state quantum mechanics. To keep the discussion as
simple as possible we have so far concentrated on this system.
Certain important features of quantum mechanics,
such as the phenomenon of entanglement, are visible, however,
only in more complex systems, such as four state quantum
mechanics. This also applies to the inconsistencies of certain
approaches to a classical implementation of quantum mechanics,
which become apparent by applying the Kochen-Specker
theorem. They only arise if three or more quantum
states are involved. (No such inconsistency arises in our
approach.) A classical statistical ensemble which corresponds
to four state quantum mechanics and the phenomenon of 
entanglement have been discussed in ref. [Igj . For the sake 
of completeness of this paper we summarize in this section
certain key features of this work.

1. Four-state quantum system 

Let us again consider a classical statistical ensemble,
with microstates $\alpha$ labeled by a number of real parameters
$f_k$ and probabilities $p_\alpha = p(f_k) \geq 0$, $\sum_{\{f_k\}} p(f_k) = 1$.
The manifold of micro-states parameterized by $f_k$ is now
different from $S^2$ and will be specified below. We sometimes
employ a discrete language corresponding to a finite
number of values for $f_k$, such that a "classical pure state"
has $p(f_k) = 1$ for one particular microstate $f_k$, while $p$ vanishes
for all other microstates. It is understood that we take
a continuum limit with an infinite number of micro-states
and a continuous vector $f = (f_1, f_2, \dots)$. Also the class of
possible quantum observables will be extended beyond the
two-level observables. Four state quantum mechanics has
three linearly independent commuting operators and we
will see that they correspond to the possibility to measure
more than one bit simultaneously.

We concentrate on possible measurements that can only
resolve two bits. (In a quantum language this corresponds
to two spins that can only have the values up or down.)
For any individual measurement, the measurement-device
or apparatus can only take the values $+1$ or $-1$ for bit 1 and
the same for bit 2. In total there are four possible outcomes
of an individual measurement, i.e. $(++)$, $(+-)$, $(-+)$ and
$(--)$. We describe measurements of one bit again by two-level
observables $A$ that are characterized by the probabilities
$w_+^{(A)}(f_k)$ and $w_-^{(A)}(f_k) = 1 - w_+^{(A)}(f_k)$ to find a value $+1$
or $-1$ in any given microstate $f_k$. As before, the mean
value of $A$ in a microstate $f_k$ reads

$$\bar{A}(f_k) = w_+^{(A)}(f_k) - w_-^{(A)}(f_k), \tag{144}$$

and the ensemble average obeys

$$\langle A\rangle = \sum_{\{f_k\}} p(f_k)\,\bar{A}(f_k). \tag{145}$$

We concentrate first on three such "two-level observables",
namely $T_1$ for the measurement of bit 1, $T_2$ for
the measurement of bit 2, and $T_3$ for the product of bit
1 and bit 2. Denoting by $W_{++}$, $W_{+-}$, $W_{-+}$ and $W_{--}$
the probabilities to measure in the ensemble the outcomes
$(++)$, $(+-)$, $(-+)$ and $(--)$, one has

$$\langle T_1\rangle = W_{++} + W_{+-} - W_{-+} - W_{--},$$
$$\langle T_2\rangle = W_{++} - W_{+-} + W_{-+} - W_{--},$$
$$\langle T_3\rangle = W_{++} - W_{+-} - W_{-+} + W_{--}, \tag{146}$$

such that $W_{++}$ etc. can be found from the average values
of the three observables $\langle T_m\rangle$. For a classical eigenstate of
the observable $T_1$ with eigenvalue $\langle T_1\rangle = 1$ the probability
for all states with $\bar{T}_1(f_k) < 1$ must vanish. Such a pure
state leads to $W_{-+} = W_{--} = 0$.
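
The inversion of eq. (146) can be written out explicitly once the normalization $W_{++} + W_{+-} + W_{-+} + W_{--} = 1$ is used. The short Python sketch below does this and maps the result back to the three averages as a consistency check; the function names and the numerical averages are assumptions of this illustration.

```python
def outcome_probabilities(t1, t2, t3):
    """Solve eq. (146) together with the normalization of the four W's."""
    return {
        "++": (1 + t1 + t2 + t3) / 4,
        "+-": (1 + t1 - t2 - t3) / 4,
        "-+": (1 - t1 + t2 - t3) / 4,
        "--": (1 - t1 - t2 + t3) / 4,
    }

def averages(w):
    """Eq. (146): the averages <T1>, <T2>, <T3> from the four probabilities."""
    t1 = w["++"] + w["+-"] - w["-+"] - w["--"]
    t2 = w["++"] - w["+-"] + w["-+"] - w["--"]
    t3 = w["++"] - w["+-"] - w["-+"] + w["--"]
    return t1, t2, t3

w = outcome_probabilities(0.2, -0.4, 0.1)        # assumed averages
w_eigenstate = outcome_probabilities(1.0, 0.0, 0.0)   # eigenstate of T1
```

For $\langle T_1\rangle = 1$ one finds $W_{-+} = W_{--} = 0$, in line with the statement above.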

Let us now specify our system. For the manifold
of all microstates we choose the homogeneous space
$SU(4)/SU(3)\times U(1)$. We parameterize the embedding
space $\mathbb{R}^{15}$ by the 15 components $f_k$ of a vector ($k = 1 \dots 15$).
It is normalized according to $\sum_k f_k^2 = 3$ and
obeys eight additional constraints that reduce the independent
coordinates to six, as appropriate for the dimension
of $SU(4)/SU(3)\times U(1)$. An easy way to obtain the constraints
for $f_k$ employs a hermitean $4\times 4$ matrix $\tilde\rho$,

$$\tilde\rho = \frac{1}{4}\,(1 + f_k L_k), \qquad f_k = \mathrm{tr}(\tilde\rho L_k). \tag{147}$$

(Summation over repeated indices is always implied.) Here
$L_k$ are fifteen $4\times 4$ matrices obeying

$$L_k^2 = 1, \qquad \mathrm{tr}\,L_k = 0, \qquad \mathrm{tr}(L_k L_l) = 4\delta_{kl}. \tag{148}$$

They read explicitly (with the Pauli $2\times 2$ matrices $\tau_k$)

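$$L_1 = \mathrm{diag}(1,1,-1,-1), \qquad L_2 = \mathrm{diag}(1,-1,1,-1), \qquad L_3 = \mathrm{diag}(1,-1,-1,1),$$
$$L_4 = \begin{pmatrix}\tau_1 & 0\\ 0 & \tau_1\end{pmatrix}, \quad
L_5 = \begin{pmatrix}\tau_2 & 0\\ 0 & \tau_2\end{pmatrix}, \quad
L_6 = \begin{pmatrix}\tau_1 & 0\\ 0 & -\tau_1\end{pmatrix}, \quad
L_7 = \begin{pmatrix}\tau_2 & 0\\ 0 & -\tau_2\end{pmatrix}, \tag{149}$$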

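with $L_8, L_9, L_{10}, L_{11}$ obtained from $(L_4, L_5, L_6, L_7)$ by
exchanging the second and third rows and columns, and
$L_{12}, L_{13}, L_{14}, L_{15}$ similarly by exchange of the second and
fourth rows and columns. The matrix $\tilde\rho$ parameterizes the
homogeneous space $SU(4)/SU(3)\times U(1)$ if it obeys

$$\tilde\rho = U\hat\rho_1 U^\dagger, \qquad UU^\dagger = U^\dagger U = 1, \qquad \hat\rho_1 = \mathrm{diag}(1,0,0,0), \tag{150}$$

for some appropriate unitary matrix $U$. This implies

$$\tilde\rho^2 = \tilde\rho, \qquad \tilde\rho_{\alpha\alpha} \geq 0, \qquad \sum_\alpha \tilde\rho_{\alpha\alpha} = \mathrm{tr}\,\tilde\rho = 1. \tag{151}$$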

The observables $T_{1,2,3}$ are specified by $\bar{T}_m(f_k) = f_m$,
$m = 1,2,3$, which is equivalent to the specification
of $w_\pm^{(m)}(f_k)$ in eq. (144). Already at this stage
we get a glimpse of the possibility of entanglement, since
pure states with $f_1 = f_2 = 0$, $f_3 = -1$ will lead to
$\langle T_1\rangle = \langle T_2\rangle = 0$, $\langle T_3\rangle = -1$, and therefore to a
correlation for opposite values of bit 1 and bit 2,
$W_{++} = W_{--} = 0$, $W_{+-} = W_{-+} = \tfrac{1}{2}$. We label these two-level
observables by a real vector with components $e_k$, with
$e_k(T_m) = \delta_{km}$, $m = 1 \dots 3$, $k = 1 \dots 15$. The mean value
of $T_m$ in a given microstate $f_k$ can then be written in the
form (with $e_k = e_k(T_m)$)

$$\bar{T}_m(f_k) = f_k e_k. \tag{152}$$

Similar to eq. (40) we represent an observable $A(e_k)$, labeled
by $e_k$, in terms of a hermitean operator

$$\hat{A} = e_k L_k, \qquad e_k(A) = \frac{1}{4}\,\mathrm{tr}(\hat{A} L_k), \tag{153}$$

with $\sum_k e_k^2 = 1$ for $\hat{A}^2 = 1$. In this language one finds
$\bar{A}(f_k) = f_k e_k = \mathrm{tr}(\hat{A}\tilde\rho)$.

We define a density matrix by

$$\rho = \frac{1}{4}\,(1 + \rho_k L_k), \qquad \rho_k = \sum_{\{f_k\}} p(f_k)\, f_k. \tag{154}$$

Eq. (145) yields the familiar quantum law for expectation
values

$$\langle T_m\rangle = \mathrm{tr}(\hat{T}_m\rho). \tag{155}$$

Many details of the classical probability distribution
for mixed states cannot be resolved by measurements of
$\langle T_m\rangle$ - only the entries of the density matrix $\rho_k$ matter. In
contrast, for classical pure states only one microstate $f_k$
contributes, with $\rho_k = f_k$, $\rho = \tilde\rho$, and therefore $\rho^2 = \rho$ as
appropriate for a quantum pure state density matrix.

The description of pure states in terms of wave functions
$\psi_\alpha$, $\psi^\dagger\psi = 1$, can be obtained from the density matrix in
a standard way, $\rho_{\alpha\beta} = \psi_\alpha\psi_\beta^*$, $\psi_\alpha = U_{\alpha\beta}(\hat\psi_1)_\beta$,
$(\hat\psi_m)_\alpha = \delta_{m\alpha}$. This expresses the $f_k$ as a quadratic form in the
complex four-vector $\psi_\alpha$,

$$f_k = \psi^\dagger L_k\psi, \qquad \langle A\rangle = \psi^\dagger\hat{A}\psi, \tag{156}$$

and shows directly that only six components of $f_k$ are independent.
The quantum mechanical wave function $\psi$ appears
here as a convenient way to parameterize the manifold
of microstates in classical statistics. The classical pure
states are in one to one correspondence to the quantum
pure states. For a pure state the purity $\rho_k\rho_k$ (112) equals
three.
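
The statement that a pure state has purity three can be checked numerically from eq. (156). The sketch below is an illustration under a stated assumption: it takes the fifteen $L_k$ to coincide, up to ordering and signs, with the direct products $\tau_a \otimes \tau_b$ of Pauli matrices and the unit matrix (excluding the unit matrix itself), which agrees with the explicit products listed in eq. (161). Since only $\sum_k f_k^2$ is evaluated, neither ordering nor signs matter. The wave function is an arbitrary example.

```python
import itertools
import math

# Unit matrix and Pauli matrices tau_1, tau_2, tau_3.
PAULI = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def kron(a, b):
    """Direct product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def expectation(op, psi):
    """psi^dagger op psi for a hermitean 4x4 matrix op (real part returned)."""
    value = sum(complex(psi[r]).conjugate() * op[r][c] * psi[c]
                for r in range(4) for c in range(4))
    return value.real

def microstate_vector(psi):
    """The fifteen components f = psi^dagger L psi, eq. (156), in an assumed ordering."""
    return [expectation(kron(PAULI[a], PAULI[b]), psi)
            for a, b in itertools.product(range(4), repeat=2)
            if (a, b) != (0, 0)]

raw = [1.0, 2.0j, -1.0, 0.5]                      # arbitrary example amplitudes
norm = math.sqrt(sum(abs(c) ** 2 for c in raw))
psi = [c / norm for c in raw]                     # normalized wave function
f = microstate_vector(psi)
purity = sum(fk * fk for fk in f)
```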

2. Entanglement 

Let us next discuss a classical ensemble that realizes a
typical entangled quantum state. We consider the wave
functions

$$\psi_\pm = \frac{1}{\sqrt{2}}\,(\hat\psi_2 \pm \hat\psi_3), \tag{157}$$

with associated pure state density matrices $\rho_\pm$. These
states are eigenstates of $T_3$ with eigenvalue $-1$. Writing

$$\rho_\pm = \tilde\rho_\pm = \frac{1}{4}\bigl(1 - L_3 \pm (L_{12} - L_{14})\bigr), \tag{158}$$

we infer for the corresponding classical pure state $f_3 = -1$,
$f_{12} = \pm 1$, $f_{14} = \mp 1$, and all other $f_k$ vanishing. Thus
$\langle T_1\rangle = \langle T_2\rangle = 0$ implies that the values of bit 1 and bit 2 are
randomly distributed, with equal probabilities to find $+1$
or $-1$. Nevertheless, due to $\langle T_3\rangle = -1$, the product of both
bits has a fixed value. Whenever bit 1 is measured to be
positive, one is certain that a measurement of bit 2 yields a
negative value, and vice versa. The two bits are maximally
anticorrelated. We denote the conditional probability to
find a value $\epsilon$ for bit 2 if bit 1 has been measured to have a
value $\gamma$ by $p(\epsilon;\gamma)$. For our entangled state it obeys
$p(1;1) = p(-1;-1) = 0$, $p(1;-1) = p(-1;1) = 1$. We will see below
that for a typical entangled state further observables are
strongly correlated or anticorrelated.

Beyond $T_{1,2,3}$ we consider a set of fifteen basis observables
$T_k$, $k = 1 \dots 15$. They are all two-level observables
with spectrum $\pm 1$, specified by the mean value in a microstate
$f_k$

$$\bar{T}_m(f_k) = f_m, \qquad m = 1 \dots 15. \tag{159}$$

The ensemble averages of the basis observables

$$\langle T_m\rangle = \sum_{\{f_k\}} p(f_k)\,\bar{T}_m(f_k) = \sum_{\{f_k\}} p(f_k)\, f_m = \rho_m \tag{160}$$

characterize the quantum state, cf. eq. (154). We can
generalize eq. (152) for arbitrary $m$, with $e_k(T_m) = \delta_{km}$,
and obtain $L_k$ as the quantum operators associated to $T_k$
by eq. (153).

Let us now describe the measurement of two spin observables
with a relative rotation in the entangled state
given by $\rho_-$ (158). A rotated first spin observable $A(\vartheta)$
has the associated operator $\hat{A}(\vartheta) = \cos\vartheta\, L_1 + \sin\vartheta\, L_8$,
while a rotated second spin observable $B(\varphi)$ is associated
to $\hat{B}(\varphi) = \cos\varphi\, L_2 + \sin\varphi\, L_4$. This is most easily seen in a
direct product basis where

$$L_1 = (\tau_3\otimes 1), \qquad L_2 = (1\otimes\tau_3), \qquad L_3 = (\tau_3\otimes\tau_3),$$
$$L_8 = (\tau_1\otimes 1), \qquad L_4 = (1\otimes\tau_1), \qquad L_{12} = (\tau_1\otimes\tau_1), \tag{161}$$
$$L_6 = (\tau_3\otimes\tau_1), \qquad L_{10} = (\tau_1\otimes\tau_3), \qquad L_{14} = -(\tau_2\otimes\tau_2).$$

In this basis the entangled state density matrix $\rho_-$ (158)
takes the intuitive form

$$\rho_- = \frac{1}{4}\bigl(1 - (\tau_1\otimes\tau_1) - (\tau_2\otimes\tau_2) - (\tau_3\otimes\tau_3)\bigr). \tag{162}$$

All three spin components are maximally anticorrelated.
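
As a cross-check of eqs. (157) and (162), the following Python sketch builds the right-hand side of eq. (162) from direct products of Pauli matrices and compares it with the projector on $\psi_-$. It assumes the basis ordering $\hat\psi_1 \dots \hat\psi_4$ = (up up, up down, down up, down down), with the first factor of the direct product acting on bit 1; the helper names are choices of this example.

```python
import math

TAU = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def kron(a, b):
    """Direct product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def rho_minus_from_products():
    """Right-hand side of eq. (162): (1 - sum_a tau_a x tau_a) / 4."""
    rho = [[(1.0 if r == c else 0.0) for c in range(4)] for r in range(4)]
    for a in (1, 2, 3):
        prod = kron(TAU[a], TAU[a])
        for r in range(4):
            for c in range(4):
                rho[r][c] -= prod[r][c]
    return [[entry / 4 for entry in row] for row in rho]

def projector(psi):
    """Pure state density matrix rho_{ab} = psi_a psi_b^*."""
    return [[psi[r] * complex(psi[c]).conjugate() for c in range(4)]
            for r in range(4)]

s = 1 / math.sqrt(2)
psi_minus = [0.0, s, -s, 0.0]          # (psi_2 - psi_3)/sqrt(2), eq. (157)
rho_a = rho_minus_from_products()
rho_b = projector(psi_minus)
max_diff = max(abs(rho_a[r][c] - rho_b[r][c])
               for r in range(4) for c in range(4))
```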

As we have discussed extensively in sect. VIII, the product
of two measurements of two spin observables is given
by the conditional quantum correlation

$$\langle AB\rangle_m = \frac{1}{2}\,\mathrm{tr}\bigl(\{\hat{A},\hat{B}\}\rho\bigr). \tag{163}$$

Here the conditional probabilities are evaluated for minimally
destructive measurements - details can be found in
ref. [3, [3]. For a general state one finds for the rotated
spins the correlation

$$\langle A(\vartheta)B(\varphi)\rangle_m = \cos\vartheta\cos\varphi\,\rho_3 + \cos\vartheta\sin\varphi\,\rho_6
+ \sin\vartheta\cos\varphi\,\rho_{10} + \sin\vartheta\sin\varphi\,\rho_{12}. \tag{164}$$

For the entangled state $\rho_-$ (158) one has $\rho_3 = \rho_{12} = -1$,
$\rho_6 = \rho_{10} = 0$ and therefore

$$\langle A(\vartheta)B(\varphi)\rangle_m = -\cos(\vartheta - \varphi) = C_m(\vartheta - \varphi). \tag{165}$$

This is the same correlation as for quantum mechanics. 

In contrast, we may consider the "classical correlation 
function" 

$$\langle A(\vartheta)\cdot B(\varphi)\rangle = C_{cl}(\vartheta - \varphi), \tag{166}$$

which could be defined if the probabilistic observables are 
realized as classical observables on the substate level, and if 
the conditional probabilities W^^ , W^^ in eqs. ([55)1 . ([55]) 
are replaced by the "classical probabilities" W^^ = W^^. 
In this case one can show Bell's inequality, which reads for 
our situation 

$$|C_{cl}(\vartheta_1) - C_{cl}(\vartheta_2)| \leq 1 + C_{cl}(\vartheta_1 - \vartheta_2). \tag{167}$$

The classical correlation (166) and the conditional correlation
(165) are clearly different. Replacing $C_{cl}$ in eq. (167)
by $C_m$ as given by eq. (165), one finds that the inequality
is violated for a range of angles, for example for $\vartheta_1 = \pi/2$,
$\vartheta_2 = \pi/4$. This demonstrates again the crucial importance
of the use of the appropriate correlation function for the
description of the outcome of two measurements.
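
The violation quoted above is easy to verify numerically. The sketch below evaluates eq. (164) for the entangled state, which reproduces $C_m = -\cos(\vartheta - \varphi)$ of eq. (165), and then inserts $C_m$ into the inequality (167) at $\vartheta_1 = \pi/2$, $\vartheta_2 = \pi/4$. The function names are assumptions of this example.

```python
import math

def conditional_correlation(theta, phi, rho3, rho6, rho10, rho12):
    """Eq. (164): conditional correlation of the two rotated spins."""
    return (math.cos(theta) * math.cos(phi) * rho3
            + math.cos(theta) * math.sin(phi) * rho6
            + math.sin(theta) * math.cos(phi) * rho10
            + math.sin(theta) * math.sin(phi) * rho12)

def c_m(angle):
    """Eq. (165) for the entangled state: rho3 = rho12 = -1, rho6 = rho10 = 0.

    The correlation depends only on the relative angle, so phi is set to zero.
    """
    return conditional_correlation(angle, 0.0, -1.0, 0.0, 0.0, -1.0)

def bell_excess(corr, theta1, theta2):
    """Left side minus right side of eq. (167); a positive value is a violation."""
    return abs(corr(theta1) - corr(theta2)) - (1.0 + corr(theta1 - theta2))

excess = bell_excess(c_m, math.pi / 2, math.pi / 4)
```

At these angles the left-hand side is $\cos(\pi/4) \approx 0.71$ while the right-hand side is $1 - \cos(\pi/4) \approx 0.29$, so the excess is $\sqrt{2} - 1 > 0$.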

3. Interference 

Other interesting quantum phenomena are the superposition
of states and interference. Consider two pure quantum
states that evolve in time according to

$$\psi_a = \frac{1}{\sqrt{2}}\,(\hat\psi_1 + \hat\psi_2)\,e^{-i\omega_a t}, \qquad
\psi_b = \frac{1}{\sqrt{2}}\,(\hat\psi_1 - \hat\psi_2)\,e^{-i\omega_b t}. \tag{168}$$

The corresponding density matrices are time independent,
$\rho_{a,b} = (1 + L_1 \pm L_4 \pm L_6)/4$. Both states describe an eigenstate
of the first bit, $\langle T_1\rangle = 1$, whereas the second bit is
randomly distributed, $\langle T_2\rangle = 0$. Due to the time dependent
phase, the interference can be positive or negative for
the superposition of both states, $\psi = \frac{1}{\sqrt{2}}(\psi_a + \psi_b)$. A
quantum mechanical computation leads to a characteristic
oscillation of $\langle T_2\rangle$,

$$\langle T_2\rangle = \psi^\dagger L_2\psi = \cos(\Delta t), \qquad \Delta = \omega_a - \omega_b, \tag{169}$$

as known from the oscillation of a spin in the $z$-direction for
a superposition of spin-eigenstates in the $x$-direction, with
different energies for the positive and negative $S_x$ eigenvalues.
A classical rotation $f_2 = f_3 = \cos(\Delta t)$ reproduces
the "interference pattern" (169). An evolution law
$\partial_t f_2 = \Delta f_5$, $\partial_t f_5 = -\Delta f_2$, $f_3 = f_2$, $f_7 = f_5$, $f_1 = 1$, $f_k = 0$
otherwise, has solutions leading to a density matrix which
corresponds to the superposed state $\psi$,

$$\rho = \frac{1}{4}\bigl\{1 + L_1 + \cos(\Delta t)(L_2 + L_3) - \sin(\Delta t)(L_5 + L_7)\bigr\}. \tag{170}$$

A classical statistical time evolution can yield the same
dependence of expectation values as the quantum mechanical
interference pattern.
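
The oscillation (169) can be reproduced directly from the wave functions (168). The Python sketch below forms the superposition and evaluates the mean values of the diagonal operators $L_1$ and $L_2$; the frequencies and the helper names are assumptions of this illustration.

```python
import cmath
import math

def superposed_state(t, omega_a, omega_b):
    """psi = (psi_a + psi_b)/sqrt(2) with psi_a, psi_b from eq. (168)."""
    phase_a = cmath.exp(-1j * omega_a * t)
    phase_b = cmath.exp(-1j * omega_b * t)
    s = 1 / math.sqrt(2)
    psi_a = [s * phase_a, s * phase_a, 0, 0]
    psi_b = [s * phase_b, -s * phase_b, 0, 0]
    return [s * (a + b) for a, b in zip(psi_a, psi_b)]

def mean_diagonal(psi, diagonal):
    """psi^dagger L psi for a diagonal operator L given by its diagonal entries."""
    return sum(d * abs(c) ** 2 for d, c in zip(diagonal, psi))

L1_DIAG = (1, 1, -1, -1)   # L_1 = diag(1, 1, -1, -1)
L2_DIAG = (1, -1, 1, -1)   # L_2 = diag(1, -1, 1, -1)

omega_a, omega_b = 2.0, 0.5          # assumed frequencies
times = [0.0, 0.3, 1.7]
t2_values = [mean_diagonal(superposed_state(t, omega_a, omega_b), L2_DIAG)
             for t in times]
```

The first bit stays in its eigenstate for all times, while the mean value of the second bit follows $\cos(\Delta t)$ with $\Delta = \omega_a - \omega_b$.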



4. Fermions and bosons 

Our classical statistics setting can also describe identical
bosons or fermions. We may identify the two bits with two
particles that can have spin up or down,

$$\hat\psi_1 = |\uparrow\uparrow\rangle, \qquad \hat\psi_2 = |\uparrow\downarrow\rangle, \qquad
\hat\psi_3 = |\downarrow\uparrow\rangle, \qquad \hat\psi_4 = |\downarrow\downarrow\rangle. \tag{171}$$

If the particles are identical, no distinction between bit 1
and bit 2 should be possible. This requires that the system
must be symmetric under the exchange of the two bits,
imposing restrictions on the allowed probability distributions
$p(f_k)$. The symmetry transformation corresponds to
an exchange of the second and third rows and columns
of $\rho$. On the level of the $f_k$ this amounts to a mapping
$f_k \to f'_k$: $f_1 \leftrightarrow f_2$, $f_4 \leftrightarrow f_8$, $f_5 \leftrightarrow f_9$, $f_6 \leftrightarrow f_{10}$,
$f_7 \leftrightarrow f_{11}$, $f_{13} \leftrightarrow f_{15}$, while $f_3$, $f_{12}$ and $f_{14}$ remain invariant.
Allowed probability distributions must obey $p(f'_k) = p(f_k)$.
In particular, the allowed pure states are restricted by
$f_1 = f_2$, $f_4 = f_8$, $f_5 = f_9$, $f_6 = f_{10}$, $f_7 = f_{11}$,
and $f_{13} = f_{15}$.

Consider the pure states $\psi_+$ and $\psi_-$ in eq. (157). For
both states the density matrix $\rho_\pm$ is compatible with the
symmetry. This does not hold for the density matrices
corresponding to the states $\hat\psi_2$ or $\hat\psi_3$. In fact, linear
superpositions of $\psi_+$ and $\psi_-$ are forbidden by the symmetry: a
pure state $a\psi_- + b\psi_+$ must have $a = 0$ or $b = 0$. The symmetry
requirement acts as a "superselection rule" for the
allowed pure states or density matrices. For an arbitrary
state vector $a\psi_- + b\psi_+ + c\hat\psi_1 + d\hat\psi_4$ the symmetry of $\rho$
requires either $a = 0$ or $b = c = d = 0$. We observe that $\psi_-$
switches sign under the symmetry transformation, as is
characteristic for a state consisting of two identical fermions.
In contrast, the boson wave function $\psi = b\psi_+ + c\hat\psi_1 + d\hat\psi_4$
is invariant under the "particle exchange symmetry".
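
The superselection rule can be illustrated with a few lines of Python. The sketch builds the density matrix of a state $a\psi_- + b\psi_+ + c\hat\psi_1 + d\hat\psi_4$, exchanges the second and third rows and columns, and tests whether the density matrix is unchanged. The function names, the tolerance and the sample coefficients are assumptions of this example; the states are left unnormalized since normalization does not affect the symmetry test.

```python
import math

def density_matrix(psi):
    """Pure state density matrix rho_{ab} = psi_a psi_b^* (not normalized)."""
    return [[complex(psi[r]) * complex(psi[c]).conjugate() for c in range(4)]
            for r in range(4)]

def exchange_bits(rho):
    """Exchange the second and third rows and columns of rho."""
    perm = (0, 2, 1, 3)
    return [[rho[perm[r]][perm[c]] for c in range(4)] for r in range(4)]

def is_exchange_symmetric(psi, tol=1e-12):
    """True if the density matrix is invariant under the exchange of the two bits."""
    rho = density_matrix(psi)
    swapped = exchange_bits(rho)
    return all(abs(rho[r][c] - swapped[r][c]) < tol
               for r in range(4) for c in range(4))

def state(a, b, c, d):
    """Components of a*psi_minus + b*psi_plus + c*psi_1 + d*psi_4."""
    s = 1 / math.sqrt(2)
    return [c, s * (b + a), s * (b - a), d]

fermion_ok = is_exchange_symmetric(state(1, 0, 0, 0))        # psi_minus alone
boson_ok = is_exchange_symmetric(state(0, 1, 0.5, -0.3))     # a = 0
mixed_forbidden = is_exchange_symmetric(state(1, 1, 0, 0))   # a and b both nonzero
```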

XIV. PROBABILISTIC REALISM 

We have explicitly constructed a classical statistical set- 
ting which realizes all laws of quantum mechanics. This 
construction is independent of the conceptual and philo- 
sophical interpretations of quantum mechanics. Neverthe- 
less, it may have important conceptual consequences. In 
this section we argue that our setting is not in contradic- 
tion with physical realism, nor with locality in the sense 
that no signals traveling faster than light are needed. The 
quantum statistical systems are characterized, however, by 
a property of statistical "incompleteness" , in the sense that 
joint probabilities cannot be used for the prediction of out- 
comes of measurements of arbitrary pairs of observables. 
This "incompleteness" is intrinsic for quantum systems - 
possible additional, more complete statistical information 
about joint probabilities is irrelevant for the outcome of 
measurements of the quantum observables. It can only 
specify some information about the "environment" of the 
quantum system. Statistical completeness for all observables,
which means the availability and use of joint probabilities
for measurement correlations of all pairs of observables,
implies Bell's inequalities and therefore contradicts
the observational evidence for quantum correlations.
"Local hidden variable theories" usually assume statistical
completeness. Such theories are not compatible with our 
setting. 

From our point of view the most general description of 
physical reality is genuinely probabilistic [22]. Statements
about reality concern expectation values and measurement 
correlations for observables. They are assumed to be, in 
principle, independent of the observer - the physical reality 
of correlations exists independently of an observer looking 
at them or not. In view of the presence of correlations in 
the cosmic microwave background emitted about 400 000 
years after the big bang and concerning wavelengths of the 
size of the observable universe, it may indeed reasonably 
be assumed that such correlations are independent of a 
possible observation. This does not exclude that in some 
particular cases the correlations depend on the experimen- 
tal setup - after all, the apparatus is part of the physical 
reality. 

We may quote the EPR-criterion [1] for the existence of
an element of physical reality: "if, without in any way dis- 
turbing the system, we can predict with certainty (i.e. with 
probability equal to unity) the value of a physical quantity, 
then there exists an element of physical reality correspond- 
ing to this physical quantity" . In the conceptual setting 
of "probabilistic realism" statistical correlations should be 
considered as possible "elements of physical reality". We 
will argue that in the typical EPR-case of two spatially sep- 
arated spins, which are emitted from a spinless source and 
therefore have opposite directions, the physical reality con- 
cerns the maximal anticorrelation of the spins rather than 
the value of the spin of one of the particles. The essential 
statement of a physical theory describing the reality is then 
that the signs of the spins are opposite. This differs from 
the usual approach, where it is argued that at the moment 
when one of the spins is measured the value of the second 
spin is physical reality, and the second spin must therefore 
have this value even before the measurement if no signals 
from the measurement of the first spin can reach the sec- 
ond one. As is well known, this implies Bell's inequalities 
and leads to contradiction with quantum mechanics. In 
our view, the only element of physical reality which exists 
before the measurement is the maximal anticorrelation be- 
tween the two spins. It is a priori not fixed if the necessary 
element of reality should refer to values of the spins or to 
correlations. In the present case it concerns the correlation. 

Of course, if one spin is measured, the value of the sec- 
ond one is fixed in consequence. After the measurement of 
one of the spins, one may eliminate all possibilities contra- 
dicting this measurement. This corresponds in quantum 
mechanics to the reduction of the wave function. We em- 
phasize that this needs no exchange of signals and no fixed 
value of the second spin before the measurement. We could 
omit the reduction of the wave function, which is a pure 
tool of convenience, and only describe measurement corre- 
lations of events for the original wave function. A physical 
theory needs, of course, a specification how this correlation 
is calculated - in our approach as the conditional correla- 
tion in terms of the conditional probabilities. 

Correlated systems cannot be separated into subsystems
for which predictions can be made using only the information
available in the subsystems. This is basic knowledge
in any statistical system. In our approach it applies to the 
system of two spatially separated spins. Despite their sepa- 
ration, they cannot be treated as two independent systems 
of one particle with spin each. This would neglect the cor- 
relation. If one tries to do so nevertheless, one runs into 
conceptual contradictions. Some of the intuitive puzzles for 
the quantum mechanical system of two particles with total 
spin zero arise from the tendency to treat one of the parti- 
cles as an isolated subsystem if it is separated sufficiently 
far from the other particle. However, due to the existence 
of correlations, the system always needs to be treated as a 
whole, even for arbitrarily large separation of the particles. 

The possibility of nonlocal correlations is well known in 
statistical physics. This means that observables can be 
correlated even if they concern spatially separated regions 
and no signals can be exchanged between these regions. 
An example is provided by the macroscopic correlations between
spins in a ferromagnet somewhat above the critical temperature,
where the correlation length can reach a macroscopic size.
Perhaps even simpler is the phenomenon of order. If the 
domains of magnetization are large enough, the measure- 
ment of the mean spin orientation in one region of the 
domain allows one to predict immediately the mean spin 
orientation in other regions of the domain. Therefore a 
correlation can be predicted even if measurements of the 
spin orientation are spatially separated in the sense that no 
signal can propagate between the different regions during 
the time of the measurement. 

Of course, the correlations must have been generated by 
local physical processes in the past. The original adjust- 
ment of the mean values of the spins into a given direction 
(within one of the ordered domains) must have proceeded 
by exchange processes which can propagate at most with the
speed of light. The analogue for the cosmic microwave 
background is the formation of correlations for metric fluc- 
tuations during the inflationary phase, which only later get 
separated to distances where signal exchange is no longer 
possible. Precisely the same happens for the spin correla- 
tions in the EPR-system with two particles with total spin 
zero. The correlation originates from the time of the decay 
of some spinless particle. Its persistence at later time is 
then simply a consequence of the conservation of angular 
momentum. 

What is then different between the correlations in quan- 
tum mechanics and the usual correlation in classical sta- 
tistical systems, say in thermodynamics? The central is- 
sue concerns the question of completeness of the statistical 
system, rather than issues of locality or reality. We call a 
statistical system "complete" if joint probabilities are de- 
fined for all pairs of observables, and if the measurement 
correlations for all pairs of observables are predicted by the 
joint probabilities 

$$\langle AB\rangle_m = \sum_{a,b} a\, b\, p_{ab}. \tag{172}$$

Here $a$ and $b$ are the possible measurement values of the
observables $A$ and $B$, respectively, and $p_{ab}$ denotes the joint
probability that the measurement of $A$ yields $a$ and the
measurement of B yields b. We note that the property of 
completeness depends on the set of possible observables of 
the system. 

It can be shown that eq. (172) implies Bell's inequalities
[3| , ) 0]. One concludes that quantum statistical
systems must be incomplete statistical systems. The measurement
correlation is not given by eq. (172), as we have
already argued in sect. VIII. In contrast, for a deterministic
"local hidden variable theory" one assumes the existence of
some set of hidden variables $\lambda$, which determine the values
of the observables $A$ and $B$ as $a(\lambda)$ and $b(\lambda)$, respectively.
Furthermore, it is assumed that for all $\lambda \in \Lambda$ a probability
measure $d\mu$, with $\int_\Lambda d\mu = 1$, is defined such that

$$\langle AB\rangle_m = \int_\Lambda d\mu\, a(\lambda)\, b(\lambda). \tag{173}$$

Eq. (173) implies Bell's inequalities. Local hidden variable
theories are complete statistical systems. Indeed, we may
denote by $\Lambda_{ab}$ the regions in parameter space for which the
observable $A$ takes values in the interval $[a - \delta a, a + \delta a]$,
and similarly $B$ is found in the interval $[b - \delta b, b + \delta b]$, with
$\delta a$, $\delta b$ specifying the "resolution". The joint probability is
then given by

$$p_{ab} = \int_{\Lambda_{ab}} d\mu \tag{174}$$

and eq. (173) implies eq. (172) for $|\delta a|, |\delta b| \to 0$. (For a
discrete spectrum of $A$ and $B$ one does not need $\delta a$, $\delta b$; in
this case $d\mu$ is directly given by $p_{ab}$.)
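
The bound implied by a complete statistical system can be illustrated numerically. The following sketch is not part of the original argument. It assumes the CHSH form of Bell's inequalities, $|S| \le 2$ with $S = E(a,b) + E(a,b') + E(a',b) - E(a',b')$, and the standard quantum mechanical correlation $E = -\cos\theta$ for two spins with total spin zero measured along directions differing by the angle $\theta$. The function names and the chosen angles are purely illustrative.

```python
import itertools
import math

def chsh(E):
    """CHSH combination S = E(a,b) + E(a,b') + E(a',b) - E(a',b')."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

def max_chsh_deterministic():
    """Largest |S| over all assignments of fixed values +-1 to A, A', B, B'.

    Each assignment plays the role of one hidden variable lambda in eq. (173).
    A probability measure over lambda only averages these values, so the
    same bound holds for every complete statistical system of this type.
    """
    best = 0
    for a0, a1, b0, b1 in itertools.product((-1, 1), repeat=4):
        A, B = (a0, a1), (b0, b1)
        best = max(best, abs(chsh(lambda i, j: A[i] * B[j])))
    return best

def singlet_chsh():
    """|S| for the spin-zero pair with E = -cos(angle difference)."""
    alice = (0.0, math.pi / 2)            # illustrative directions for A, A'
    bob = (math.pi / 4, -math.pi / 4)     # illustrative directions for B, B'
    return abs(chsh(lambda i, j: -math.cos(alice[i] - bob[j])))

print(max_chsh_deterministic())  # bound for fixed values a(lambda), b(lambda)
print(singlet_chsh())            # value for the total-spin-zero correlation
```

With these angles the spin-zero correlation exceeds the bound obtained from fixed values $a(\lambda)$, $b(\lambda)$, which is the sense in which the measurement correlation cannot be given by eq. (172).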

In our formalism with substates $\tau$ we could consider $\tau$ as
hidden variables, since for every $\tau$ one has fixed values of
the observables $A_\tau$, $B_\tau$ (corresponding to $a(\lambda)$, $b(\lambda)$). The
probability $p_\tau$ to find the substate $\tau$ specifies the joint
probability. (In case of several states with the same $A_\tau$, $B_\tau$ one
has to sum over all such states.) If the classical correlation

$$\langle A \cdot B\rangle = \sum_\tau p_\tau A_\tau B_\tau \tag{175}$$

would define the measurement correlation, Bell's inequalities
would follow. In sect. VIII we have argued, however,
that this correlation is not appropriate for statistical sys- 
tems that describe isolated quantum systems since it mea- 
sures properties of the environment of the system together 
with system properties. 

On the level of microstates $\sigma$ the joint probabilities are
not defined any longer. If the measurement correlation
were given by the probabilistic pointwise correlation

$$\langle A \times B\rangle = \sum_\sigma p_\sigma \bar{A}_\sigma \bar{B}_\sigma \tag{176}$$

this would again imply Bell's inequalities. Again, we have 
argued that the probabilistic pointwise correlation is not 
appropriate for the description of measurements of pairs 
of quantum observables. In contrast, the conditional cor-
relation, which predicts the outcome of pairs of measure-
ments, does not imply Bell's inequalities. Now the joint
probabilities $p_{ab}$ are not used for the prediction of the out-
come of measurements, since either they are not defined
(on the level of microstates), or they do not describe sys- 
tem properties but rather also involve details of the envi- 
ronment which are not measured by a "good quantum mea- 
surement" (on the level of substates). We have seen that 
the conditional correlations precisely describe the correla- 
tions in quantum mechanics. The choice of the appropri- 
ate correlation function for the prediction of the outcome 
of measurements of pairs of observables is crucial for the 
understanding of quantum mechanics. 



XV. CONCLUSIONS 

We have discussed classical statistical ensembles that ex- 
hibit all features of two-state and four-state quantum sys- 
tems. The quantum mechanical density matrix obtains by 
reduction of an infinity of classical micro-states to a few 
effective states. In turn, each micro-state can be obtained 
by a coarse graining of infinitely many substates. Most of 
the statistical information concerning the micro-states or 
substates is not needed for the description of the quantum 
system. It rather describes properties of the environment. 
All information relevant for the quantum system is retained 
by a "coarse graining" to a small number of effective states. 

The minimal number of effective states depends on the 
observables which can describe an isolated (or approxi- 
mately isolated) partial system as, for example, an atom 
in its environment. The expectation values of all observ- 
ables of the partial system can be computed from the "ef- 
fective probabilities" of the effective states which, in turn, 
are given by expectation values of suitable basis observ- 
ables. We have constructed a density matrix from these 
expectation values. It has all the properties of the den- 
sity matrix in quantum mechanics. In particular, the ex- 
pectation values of observables of the partial system obey 
$\langle A\rangle = \mathrm{tr}(\hat{A}\rho)$, precisely the law of quantum mechanics. We
have explicitly constructed the quantum mechanical operators
$\hat{A}$ associated with classical spin observables $A$. They do
not commute. 
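
For the two-state case these statements can be checked in a few lines. The sketch below is illustrative only. It assumes the standard parametrization $\rho = \frac{1}{2}(1 + \rho_k \tau_k)$ of a $2 \times 2$ density matrix by three expectation values $\rho_k$ and the Pauli matrices $\tau_k$. The numerical values of $\rho_k$ and all function names are hypothetical.

```python
# Pauli matrices tau_1, tau_2, tau_3 as nested lists.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def density_matrix(rho_k):
    """rho = (1 + sum_k rho_k tau_k) / 2, built from three expectation values."""
    return [[((1 if i == j else 0)
              + sum(r * tau[i][j] for r, tau in zip(rho_k, PAULI))) / 2
             for j in range(2)] for i in range(2)]

def expectation(op, rho):
    """<A> = tr(A_hat rho); the result is real for Hermitian operators."""
    return complex(trace(matmul(op, rho))).real

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(2)] for i in range(2)]

rho_k = (0.3, -0.4, 0.5)   # illustrative values with sum of squares <= 1
rho = density_matrix(rho_k)
# tr(tau_k rho) reproduces the expectation values used to build rho.
print([round(expectation(tau, rho), 12) for tau in PAULI])
# The operators for different spin directions do not commute.
print(commutator(PAULI[0], PAULI[1]))
```

The density matrix is fixed entirely by the three expectation values, and the trace rule returns exactly these values, while the associated operators have a nonvanishing commutator.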

For pure states with $\mathrm{tr}\rho^2 = 1$ one can "take the root" of
the density matrix by introducing the quantum mechanical
wave function $\psi$ in the usual way, with $\langle A\rangle = \psi^\dagger \hat{A}\psi$. The
formalism of quantum mechanics, with probability ampli- 
tudes, superposition of states and interference is recovered. 
The quantum mechanical wave function appears here as 
a derived quantity rather than the fundamental object in 
quantum mechanics. 
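
A minimal sketch of "taking the root" for the two-state case follows. It assumes the parametrization $\rho = \frac{1}{2}(1 + \rho_k \tau_k)$ with $\sum_k \rho_k^2 = 1$ for a pure state, and it uses the fact that for $\rho = \psi\psi^\dagger$ every nonvanishing column of $\rho$ is proportional to $\psi$. The overall phase of $\psi$ is arbitrary, and the helper names are hypothetical.

```python
import math

# Pauli matrices tau_1, tau_2, tau_3 as nested lists.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def wave_function(rho_k):
    """Return psi with psi psi^dagger = (1 + rho_k tau_k)/2 for a pure state.

    Assumes sum_k rho_k^2 = 1. The column of rho with the larger norm is
    normalized; this fixes psi up to an irrelevant overall phase.
    """
    r1, r2, r3 = rho_k
    if r3 >= 0:
        column = [(1 + r3) / 2, (r1 + 1j * r2) / 2]   # first column of rho
    else:
        column = [(r1 - 1j * r2) / 2, (1 - r3) / 2]   # second column of rho
    norm = math.sqrt(sum(abs(c) ** 2 for c in column))
    return [c / norm for c in column]

def expectation(op, psi):
    """<A> = psi^dagger A_hat psi."""
    total = sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
                for i in range(2) for j in range(2))
    return complex(total).real

rho_k = (2 / 3, -1 / 3, 2 / 3)   # illustrative pure state: squares sum to one
psi = wave_function(rho_k)
print([round(expectation(tau, psi), 12) for tau in PAULI])
```

The wave function obtained in this way carries the same three expectation values as the probability distribution it was derived from, which is the sense in which it is a derived quantity.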

For two-state quantum mechanics the time evolution of 
the classical probability distribution is equivalent to the 
unitary time evolution of the density matrix only if the 
purity of the ensemble is conserved. This condition is gen- 
eralized to quantum systems with more than two states in 
[l3 . [Tsj . The unitary time evolution of the density matrix 
should be interpreted as a perfect isolation of the partial 
system described by the observables. As usual, a unitary 
evolution is described by a Hamilton operator H. Since H 
is the generator of time translations it should correspond 
to the energy of the isolated partial system by virtue of the
Noether theorem. A unitary time evolution of a pure state
is described by the Schrödinger equation for $\psi$.
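
One direction of the equivalence, that a unitary evolution conserves the purity $\mathrm{tr}\rho^2$, can be verified directly for two states. The sketch below assumes an illustrative Hamiltonian $H = \frac{\omega}{2}\, n_k \tau_k$ with unit vector $n$, for which $U = e^{-iHt} = \cos(\omega t/2) - i \sin(\omega t/2)\, n_k \tau_k$ in closed form. The choice of $H$, the parameter values and the function names are assumptions made for this example and are not taken from the text.

```python
import math

# Pauli matrices tau_1, tau_2, tau_3 as nested lists.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(x[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def density_matrix(rho_k):
    """rho = (1 + sum_k rho_k tau_k) / 2."""
    return [[((1 if i == j else 0)
              + sum(r * tau[i][j] for r, tau in zip(rho_k, PAULI))) / 2
             for j in range(2)] for i in range(2)]

def unitary(n, omega, t):
    """U = exp(-i H t) for the assumed H = (omega/2) n.tau, n a unit vector."""
    c, s = math.cos(omega * t / 2), math.sin(omega * t / 2)
    return [[c * (1 if i == j else 0)
             - 1j * s * sum(nk * tau[i][j] for nk, tau in zip(n, PAULI))
             for j in range(2)] for i in range(2)]

def evolve(rho, u):
    """rho(t) = U rho U^dagger."""
    return matmul(matmul(u, rho), dagger(u))

def purity(rho):
    """tr(rho^2); equals one for pure states."""
    return complex(trace(matmul(rho, rho))).real

rho0 = density_matrix((0.3, -0.4, 0.5))          # illustrative mixed state
rho_t = evolve(rho0, unitary((0, 0, 1), 1.3, 2.0))
print(purity(rho0), purity(rho_t))               # the two values agree
```

The converse statement, that a purity-conserving evolution of the classical probability distribution is unitary, is the nontrivial part of the argument and is not tested by this sketch.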

Our construction can be extended beyond the two-state 
and four-state quantum systems. For M quantum states 
the manifold of micro-states parameterized by $f_k$ corresponds
to the homogeneous space $SU(M)/SU(M-1) \times U(1)$,
while the discussion of this paper was mainly restricted
to $M = 2$ where $SU(2)/U(1)$ parameterizes the
sphere $S^2$. Furthermore, an explicit discussion of the
phenomena of entanglement, superposition and interference
within classical statistics is given for $M = 4$ in
sect. XIII. For identical "particles" this also accounts
for the difference between fermions and bosons. We observe
that the restriction to a manifold of micro-states
$SU(M)/SU(M-1) \times U(1)$ is not necessary. The latter
is simply the minimal manifold needed in order to implement
a unitary continuous time evolution. One may embed
this manifold into a larger manifold of classical states.
Then it appears as a projection of the larger ensemble on 
the minimal manifold of micro-states. The probability dis- 
tribution on the minimal manifold of micro-states carries 
all the information needed for the expectation values of the 
observables of the "isolated system" , plus irrelevant addi- 
tional information if the state is mixed. 



The unitary time evolution of quantum mechanics ap- 
pears as a special case of a wider class of time evolutions 
of the classical ensemble. We argue that the special case of 
the unitary evolution of pure states corresponds to a partial 
fixed point of the more general evolution equations. The 
general time evolution of the classical ensemble can also ac- 
count for the phenomenon of decoherence, corresponding 
to decreasing purity, and "syncoherence" for the increase 
of purity as the pure state fixed point is approached. This 
shows that the classical ensemble can describe an incom- 
pletely isolated quantum system embedded in its environ- 
ment, with quantum mechanics as an idealization where 
the isolation becomes perfect. 

In our picture, an atom and its environment are de- 
scribed by a classical statistical ensemble with infinitely 
many degrees of freedom. If a gas of atoms is dilute enough 
the picture of an isolated atom becomes a good approxima- 
tion. Such an isolated atom can be described by a few ob- 
servables out of the infinitely many possible observables of 
the whole system. The expectation values and correlations 
of these observables can be computed by a reduction to ef- 
fective states, with "effective probabilities" mirrored in the 
density matrix. The limit of perfect isolation is described 
by a unitary time evolution - this is quantum mechanics. 